METHOD FOR SELECTING RECOMBINASE VARIANTS
WITH ALTERED SPECIFICITY
BACKGROUND OF THE INVENTION
Recombinases, integrases and resolvases (collectively referred to herein
as recombinases) mediate the site-specific recombination of DNA. These
recombinases were first identified in phage that integrate into host
chromosomes. Such integration allows the phage to remain latent in the cell as
a prophage.

Site-specific recombinases catalyze conservative DNA rearrangements
at specific target sequences. The 38 kDa Cre recombinase (cyclization
recombination), derived from the bacteriophage P1, is a well characterized and
widely used enzyme of the integrase family (reviewed by Sauer, Methods,
14:381-392 (1998)). Cre plays two essential roles in the life cycle of P1: First,
it provides a host-independent mechanism for P1's genome cyclization after
infection, which can be important when the recombination system of the host is
compromised. Second, Cre resolves dimerized P1 prophage plasmids to
guarantee proper segregation during cell division.

Cre acts on a 34 bp sequence located on both ends of the linear P1
genome that is called loxP (locus of crossover of P1; Sternberg and Hamilton,
J. Mol. Biol., 150:467-486 (1981)). loxP consists of two 13 bp inverted repeats
flanking a non-palindromic 8 bp core that defines the assigned direction of the
sequence (as shown on the upper part of Figure 1). Depending on this direction,
recombination catalyzed by Cre leads to excision or insertion of DNA flanked
by loxP sites oriented in the same direction (indicated by loxP²), but leads to
inversion when the sites are oriented in opposite directions (Figure 2).
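
The architecture described above (two 13 bp inverted repeats around an
asymmetric 8 bp core) can be checked mechanically. The following is a minimal
sketch, not part of the disclosure. It assumes the canonical loxP sequence as it
is commonly written in the literature, and the helper names are hypothetical.

```python
# Canonical 34 bp loxP sequence as commonly written in the literature
# (assumption: not copied from this document's Figure 1).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox(site):
    """Split a 34 bp lox site into (left repeat, core, right repeat)."""
    if len(site) != 34:
        raise ValueError("a lox site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def describe_lox(site):
    """Report whether the repeats are inverted and the core is palindromic."""
    left, core, right = split_lox(site)
    return {
        # The right repeat should read as the left repeat on the other strand.
        "repeats_are_inverted": right == reverse_complement(left),
        # A non-palindromic core is what gives the site a direction.
        "core_is_palindromic": core == reverse_complement(core),
    }


print(describe_lox(LOXP))
```

Because the core is not its own reverse complement, the site reads differently
on the two strands, which is what allows an orientation to be assigned to it.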

In general, Cre-recombination involves the following four events: (i)
DNA binding, (ii) synapsis (as defined below), (iii) cleavage, and (iv) strand
exchange. To study this process in greater detail, mutants defective for each
step have been isolated using several screening procedures (Wierzbicki et al.,
J. Mol. Biol., 195:785-794 (1987)). In addition, the crystal structure of Cre
complexed with an artificial suicide substrate has been recently resolved,
providing additional insights into site-specific recombination (Guo et al.,
Nature, 389:40-46 (1997)). From these studies, the following has been
proposed: Four interacting Cre molecules are necessary for recombination
between two lox sites, with each enzyme binding one inverted repeat plus the
two outermost bp of the non-symmetric core region (DNA binding). This leads
to the formation of a clamp, allowing DNA contacts in the major, as well as in
the minor groove. In the step referred to as synapsis, the two lox sites with the
bound Cre molecules are aligned in parallel, leading to an approximate 100°
bending of the DNA. In the following step of strand cleavage, one of the two
Cre molecules on each lox site causes a staggered cut in the core region, as
indicated by the vertical arrows in Figure 1. This leads to a 6 bp 5' overhang
and a covalent 3' phosphotyrosine linkage between the catalytic residue tyrosine
324 of Cre and the guanine (position 4) at the cleaving site of loxP. The created
phosphotyrosine intermediate is thought to provide the energy for the reaction,
thereby explaining why Cre does not require an external energy source. In the
next step, the first strand is exchanged between the two nicked lox sites,
creating an intermediate, named Holliday structure (Sigal and Alberts, J. Mol.
Biol., 71:789-793 (1972)). Of note, this first strand exchange is asymmetric,
since the bottom strand (Figure 1) is always exchanged first (Hoess et al., Proc.
Natl. Acad. Sci. USA, 84:6840-6844 (1987)). During the final step, the second
strand is exchanged and Cre is released from its substrate.

Because of the simplicity and the ability of Cre to function in yeast and
mammalian cells (Sauer, B., Mol. Cell. Biol., 7:2087-2096 (1987); Sauer and
Henderson, Proc. Natl. Acad. Sci. USA, 85:5166-5170 (1988); Sauer and
Henderson, Nucl. Acids Res., 17:147-161 (1989); and Sauer and Henderson, The
New Biologist, 2:441-449 (1990)), Cre assisted site-specific recombination has
become an important tool for efficient, specific, and conditional manipulations
of eukaryotic genomes (Lakso et al., Proc. Natl. Acad. Sci. USA, 89:6232-6236
(1992); Kilby et al., Trends Genet., 9:413-421 (1993); Sauer, B., Meth. Enzymol.,
225:890-900 (1993); Kühn et al., Science, 269:1427-1429 (1995); Metzger et
al., Proc. Natl. Acad. Sci. USA, 92:6991-6995 (1995)).

However, there are some inconveniences for the successful use of
Cre-related technologies, which include the following: (i) lox sites need to be
introduced by homologous recombination at the desired region into the genome
before Cre can be used, (ii) the frequency of correct site-specific recombination
due to Cre expression is not 100%, and consequently, (iii) selectable markers
are necessary in most strategies involving Cre for genome manipulation in
higher eukaryotes. These markers, e.g. neo or TK, may introduce problems in
subsequent studies, particularly in those related to animal development. The
number of available selectable markers that can be used is also limited.
Additional site-specific recombinases that also function efficiently in eukaryotic
systems, but recognize different sites from Cre, would be helpful. Similar
inconveniences limit the usefulness of other recombinases.

Therefore, it is an object of the present invention to provide a method of
identifying variant recombinases that can mediate recombination between
variant recombination sites.

It is another object of the present invention to provide variant
recombinases that can mediate recombination between variant recombination
sites.

It is another object of the present invention to provide a method of 
recombining nucleic acid molecules in vitro and in vivo. 

It is another object of the present invention to provide Cre variants that
recognize variant recombination sites.

BRIEF SUMMARY OF THE INVENTION 
Disclosed is a method for identifying variant forms of recombinases that
can mediate recombination between variant recombination sites. The method
involves producing mutant recombinases and testing the mutant recombinases
with specially designed constructs. The constructs contain variant
recombination sites that are not recognized by non-mutant recombinase but will
undergo recombination in the presence of a mutant recombinase with altered
specificity. Recombination at the variant recombination sites can be monitored
or detected by any suitable means. It is preferred that recombination is detected
by screening or selection based on the expression or lack of expression of a
reporter gene. This can be accomplished by using constructs containing a
reporter gene associated with the variant recombination sites such that the
reporter gene is rearranged or deleted, or a spacer sequence interrupting the
reporter gene is rearranged or deleted, as a result of recombination at the
recombination sites. Recombination of such constructs will result in a loss of
expression of the reporter gene, where the construct contained a functional
reporter gene, or in a gain in expression of the reporter gene, where the
construct contained a non-functional reporter gene.

The disclosed method also involves determining whether a variant
recombinase retains the ability to mediate recombination at recombination sites
recognized by non-variant recombinase. This can be accomplished by using
constructs containing recombination sites recognized by non-variant
recombinase. Recombination at these recombination sites can be monitored or
detected by any suitable means. It is preferred that recombination is detected by
screening or selection based on the expression or lack of expression of a
reporter gene. This can be accomplished by using constructs containing a
reporter gene associated with the recombination sites recognized by non-variant
recombinase such that the reporter gene is rearranged or deleted, or a spacer
sequence interrupting the reporter gene is rearranged or deleted, as a result of
recombination at the recombination sites. Recombination of such constructs
will result in a loss of expression of the reporter gene, where the construct
contained a functional reporter gene, or in a gain in expression of the reporter
gene, where the construct contained a non-functional reporter gene.

When variant recombinases are tested for activity on both variant
recombination sites and recombination sites recognized by non-variant
recombinase in the same system or at the same time, it is preferred that two
different reporter genes, which can be separately detected or monitored, be
used. In this case, the first reporter gene can be associated with the variant
recombination sites and the second reporter gene can be associated with
recombination sites recognized by non-variant recombinase.
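
The two-reporter readout can be summarized as a small decision table. The
following is only an illustrative sketch: the function names and the four
category labels are assumptions chosen for this example and are not terms
defined by the disclosed method.

```python
def reporter_expressed_after(functional_before, recombined):
    """Expected reporter expression after the assay.

    A functional reporter is lost on recombination; a non-functional
    (for example, interrupted) reporter is gained on recombination.
    """
    if not recombined:
        return functional_before
    return not functional_before


def classify_variant(acts_on_variant_sites, acts_on_wild_type_sites):
    """Classify a recombinase from two separately monitored reporters.

    The labels below are hypothetical names used only for illustration.
    """
    if acts_on_variant_sites and acts_on_wild_type_sites:
        return "broadened specificity"
    if acts_on_variant_sites:
        return "altered specificity"
    if acts_on_wild_type_sites:
        return "wild type-like"
    return "inactive"


# Example: first reporter (variant sites) changed, second reporter did not.
print(classify_variant(True, False))
```

Using two distinguishable reporters is what makes the two inputs to
`classify_variant` independently observable in a single system.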

Recombination between two recombination sites requires (1) that the
recombinase recognize the sites as recombination sites, and (2) that the
sequences of the two sites are sufficiently similar. It has been discovered that
recombination between two recombination sites (both of which are recognized
by a recombinase) can be substantially reduced or prevented by using different
compatibility sequences for the recombination sites (the recognition sequences
can also differ if the recombinase can recognize different sequences). Thus, it is
also preferred that the variant recombination sites be made incompatible with
the recombination sites recognized by non-variant recombinase by using
different compatibility sequences for the two sets of recombination sites.

Compatibility sequences in a recombination site are those sequences in
the recombination site, other than the sequences required for recognition of the
site by the recombinase, that must be similar in a pair of recombination sites for
recombination to occur between them. Many recombination sites contain
repeats of a characteristic sequence separated by spacer sequences. In such
recombination sites, the spacer sequences are generally compatibility sequences
and the repeats (or parts of the repeats) are recognition sequences.
Recombinases require specific recognition sequences but allow wide variation
in compatibility sequences. Thus, recombination sites that are recognized by a
given recombinase but are incompatible with each other can be readily designed
using the disclosed principles.

Also disclosed are variant recombinases made or identified by the
disclosed method that have broadened specificity for the site of recombination.
Specifically, the disclosed variants mediate recombination between sequences
other than recombination sites on which the wild type recombinase is active. In
general, the disclosed recombinase variants can mediate efficient recombination
between recombination sites that wild type recombinase can act on (referred to
as wild type recombination sites), between variant recombination sites not
efficiently utilized by wild type recombinase (referred to as variant
recombination sites), and between a wild type recombination site and a variant
recombination site.

Also disclosed are methods of recombining nucleic acids using the
disclosed variant recombinases. For example, the disclosed variant
recombinases can be used in any method or technique where wild type
recombinases can be used. In addition, the disclosed variant recombinases
allow different alternative recombinations to be performed since the variant
recombinases can allow much more efficient recombination between wild type
recombination sites and variant recombination sites. Control of such alternative
recombination can be used to accomplish more sophisticated sequential
recombinations to achieve results not possible with wild type recombinases.
The disclosed variant recombinases also allow recombination at specific
genomic sites without the need to first introduce a recombination site.

Also disclosed are variants of Cre recombinase that have broadened
specificity for the site of recombination. Specifically, the disclosed variants
mediate recombination between sequences other than the loxP sequence and
other lox site sequences on which wild type Cre recombinase is active. In
general, the disclosed Cre variants mediate efficient recombination between lox
sites that wild type Cre can act on (referred to as wild type lox sites), between
variant lox sites not efficiently utilized by wild type Cre (referred to as variant
lox sites), and between a wild type lox site and a variant lox site. Also disclosed
are methods of recombining nucleic acids using the disclosed Cre variants. For
example, the disclosed Cre variants can be used in any method or technique
where Cre recombinase (or other, similar recombinases such as FLP) can be
used. In addition, the disclosed Cre variants allow different alternative
recombinations to be performed since the Cre variants allow much more
efficient recombination between wild type lox sites and variant lox sites.
Control of such alternative recombination can be used to accomplish more
sophisticated sequential recombinations to achieve results not possible with
wild type Cre recombinase.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 is a comparison of three different lox sites. loxP is the original
recombination site for Cre recombinase. loxK1 and loxK2 are variant lox sites.

Figure 2 is a diagram of two different forms of construct and the 
resulting recombination products. 

Figure 3 is a diagram of an example of a random mutagenesis using
DNA shuffling.

Figure 4 is a diagram of the selection plasmid for loxK2 recombination,
pBS584. Recombination of two loxK2 sites by a potent Cre mutant will result
in the excision of EGFP and the transcriptional terminator rrnBTJi.
Subsequently, neo transcription can take place, rendering E. coli resistant to
kanamycin. Note that the promoter (pRSV), even though of eukaryotic origin,
was shown to be functional in E. coli (Antonucci et al., J. Biol. Chem.,
264:17656-17659 (1989)).

Figure 5 depicts gels of nucleic acid fragments and PCR products
generated during the DNA shuffling process.

Figure 6 is a diagram of plasmid pBAD33 used for expression of mutant
cre pools.

Figure 7 is a diagram of the construction of selection plasmids pBS568
and pBS569.

Figure 8 is a diagram of the construction of selection plasmids pBS583
and pBS584.

Figure 9 is a diagram of control plasmid pBS613.

Figure 10 is a diagram of the construction of screening plasmids pBS601
and pBS602.

Figure 11 is a diagram of examples of basic types of constructs useful in
the disclosed method. These types of constructs are: (1) interrupted constructs
where the gene is interrupted by a nucleic acid segment (which is flanked by
recombination sites) that is deleted during recombination, (2) flanked constructs
where the gene as a unit is flanked by recombination sites and the gene is
deleted by recombination, and (3) inverted constructs where a portion of the
gene is on an inverted nucleic acid segment and recombination causes the
segment to invert and reconstitute the intact gene. The type of recombination is
indicated in parentheses.

Figures 12A, 12B, and 12C are diagrams of examples of constructs and
their expected recombination when used in the disclosed method. Figure 12A
shows examples of deletion constructs (flanked and interrupted). Figure 12B
shows examples of inverted constructs. Figure 12C shows examples of
constructs that combine through recombination to reconstitute an intact gene.

Figure 13 is a diagram showing the identified amino acid changes in the
six selected Cre mutants, listed according to their position in the protein's
secondary structure (silent mutations in parentheses). Only one amino acid
change, E262G, is common to all mutants with remarkably increased loxK2
activity (R3M1, 2, 3, 5, and 6), suggesting that this mutation is essential for the
observed phenotype.

Figure 14A is a table comparing recombination frequencies in vivo
obtained with a variety of lox sites altered at positions 11 and 12. Figure 14B is
a table comparing recombination frequencies in vivo obtained with identical and
mixed lox sites. Wild type Cre and five different mutant enzymes were tested
for their performance on different lox substrates, as indicated. Given are the
obtained percentages of recombination in vivo, based on the described negative
selection.

Figure 15 is a table comparing recombination frequencies in vitro
obtained with a variety of lox sites altered at positions 11 and 12.

Figure 16 is a graph of percent of various Cre recombinases (wt, G, GA,
GN, GS, R3M3) bound to various lox sites (loxP, loxK2, loxK1).

Figure 17 shows wild type and target FRT sites.

Figure 18 shows the strategy for selection of altered specificity FLP
mutants.

Figure 19 shows an alternate target mutant FRT site. The design and
rationale for design of the target mutant FRT site is as described in Figure 17,
but the mutant FRT-M2 site differs from FRT-M by carrying a different
mutational alteration in the repeat elements.

DETAILED DESCRIPTION OF THE INVENTION

Disclosed is a method for identifying variant forms of recombinases that
can mediate recombination between variant recombination sites. The method
involves producing mutant recombinases and testing the mutant recombinases
with specially designed constructs. The constructs contain variant
recombination sites that are not recognized by non-mutant recombinase but will
undergo recombination in the presence of a mutant recombinase with altered
specificity. The disclosed method also involves determining whether a variant
recombinase retains the ability to mediate recombination at recombination sites
recognized by non-variant recombinase.

When variant recombinases are tested for activity on both variant
recombination sites and recombination sites recognized by non-variant
recombinase in the same system or at the same time, it is preferred that two
different reporter genes which can be separately detected or monitored be used.
In this case, a first reporter gene can be associated with the variant
recombination sites and a second reporter gene can be associated with
recombination sites recognized by non-variant recombinase. It is also preferred
that the variant recombination sites be made incompatible with the
recombination sites recognized by non-variant recombinase by using different
compatibility sequences for the two sets of recombination sites. This allows
separate assessment of the ability of a variant recombinase to mediate
recombination between variant recombination sites and recombination sites
recognized by non-variant recombinase.

Also disclosed are variant recombinases made or identified by the
disclosed method that have broadened specificity for the site of recombination.
Also disclosed are methods of recombining nucleic acids using the disclosed
variant recombinases. For example, the disclosed variant recombinases can be
used in any method or technique where wild type recombinases can be used. In
addition, the disclosed variant recombinases allow different alternative
recombinations to be performed since the variant recombinases can allow much
more efficient recombination between wild type recombination sites and variant
recombination sites. Control of such alternative recombination can be used to
accomplish more sophisticated sequential recombinations to achieve results not
possible with wild type recombinases.

Also disclosed are variants of Cre recombinase that have broadened
specificity for the site of recombination. Specifically, the disclosed variants
mediate recombination between sequences other than the loxP sequence and
other lox site sequences on which wild type Cre recombinase is active.
Preferred forms of the disclosed Cre variants have the amino acid sequence
SEQ ID NO:1 (top sequence, Table 11) altered by one or more amino acid
substitutions, deletions, or insertions, where the glutamic acid at amino acid 262
has been substituted with an amino acid other than glutamic acid, and where the
Cre variant recognizes (that is, mediates recombination at) a variant lox
recombination site. Useful Cre variants include proteins that recognize a variant
lox recombination site and have the amino acid sequence SEQ ID NO:1 altered
by substitution of the glutamic acid at amino acid 262 with an amino acid other
than glutamic acid and one or more of the following amino acid substitutions:
isoleucine at amino acid 16, alanine at amino acid 29, glutamine at amino acid
101, glycine at amino acid 138, asparagine at amino acid 189, serine at amino
acid 198, glutamine at amino acid 220, glutamine at amino acid 223, isoleucine
at amino acid 227, glycine at amino acid 254, arginine at amino acid 255,
glutamine at amino acid 284, leucine at amino acid 307, and serine at amino
acid 316. Preferred amino acid substitutions at amino acid position 262 include
alanine, tryptophan, or glycine.

Examples of preferred Cre variants include proteins having the amino
acid sequence SEQ ID NO:1 altered by substitutions E262G and D189N;
proteins having the amino acid sequence SEQ ID NO:1 altered by substitutions
E262G and T316S; proteins having the amino acid sequence SEQ ID NO:1
altered by substitutions E262G and D29A; proteins having the amino acid
sequence SEQ ID NO:1 altered by substitutions E262G, V16I, D189N, G198S,
R223Q, Q255R, and P307L; proteins having the amino acid sequence SEQ ID
NO:1 altered by substitution E262G; proteins having the amino acid sequence
SEQ ID NO:1 altered by substitution E262A; and proteins having the amino
acid sequence SEQ ID NO:1 altered by substitution E262W. The substitutions
above are listed using the convention where the first letter is the original amino
acid (in single letter amino acid code), the number is the amino acid position in
the protein (in this case, using the positions of wild type Cre (SEQ ID NO:1)),
and the last letter is the new amino acid (in single letter amino acid code). All
of these Cre variants recognize both wild type lox sites and variant lox sites
with an inverted repeat sequence NNNACNNCGTATA (SEQ ID NO:2).
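
The substitution convention described above (original residue, 1-based
position, new residue) is easy to parse programmatically. The sketch below is
illustrative only: the function names are hypothetical, and the short toy
sequence is an assumption used in place of SEQ ID NO:1, which is not
reproduced here.

```python
import re

# The twenty standard amino acids in single letter code.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_SUBSTITUTION = re.compile(rf"([{_AA}])(\d+)([{_AA}])")


def parse_substitution(code):
    """Parse a code such as 'E262G' into (original, position, new)."""
    match = _SUBSTITUTION.fullmatch(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    original, position, new = match.groups()
    return original, int(position), new


def apply_substitutions(protein, codes):
    """Apply substitutions to a protein string, checking the original residue."""
    residues = list(protein)
    for code in codes:
        original, position, new = parse_substitution(code)
        if residues[position - 1] != original:
            raise ValueError(f"{code}: position {position} is not {original}")
        residues[position - 1] = new
    return "".join(residues)


print(parse_substitution("E262G"))
# Toy sequence (hypothetical, NOT SEQ ID NO:1): E at position 2, D at position 4.
print(apply_substitutions("MEKDL", ["E2G", "D4N"]))
```

Checking the original residue before substituting guards against applying a
code to a sequence that is numbered differently from wild type Cre.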

The disclosed Cre variants recognize variant lox recombination sites.
Preferred variant lox sites are variant lox sites recognized by the Cre variant but
not recognized by wild type Cre. Examples of useful variant lox sites include
sites having two 13 base pair inverted repeats flanking 8 base pairs, where one
of the inverted repeats has the sequence NNNACNNCGTATA (SEQ ID NO:2);
sites having the sequence N1N2N3ACN4N5CGTATANNNNNNNNTATA
CGN5'N4'GTN3'N2'N1' (SEQ ID NO:3), where N1', N2', N3', N4', and N5' are
complementary to N1, N2, N3, N4, and N5, respectively; sites having the
sequence N1N2N3ACN4N5CGTATANNNNNNNNTATACGN5'N4'GTN3'N2'N1'
(SEQ ID NO:3), where N4N5 are AA, TC, GT, TG, GG, or CC; and sites having
the sequence GATACAACGTATATACCTTTCTATACGTTGTATC (SEQ ID
NO:4).
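
The site patterns above can be assembled from their variable parts. The
following is a minimal sketch under stated assumptions: it builds a 34 bp site
from N1N2N3, N4N5, and an 8 bp spacer, deriving the right repeat as the reverse
complement of the left repeat. The function names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Repeat pattern NNNACNNCGTATA with N restricted to A, C, G, or T.
_VARIANT_REPEAT = re.compile(r"[ACGT]{3}AC[ACGT]{2}CGTATA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_lox_site(n123, n45, spacer):
    """Assemble N1N2N3-AC-N4N5-CGTATA, the spacer, and the inverted repeat."""
    if (len(n123), len(n45), len(spacer)) != (3, 2, 8):
        raise ValueError("expected a 3 nt, a 2 nt, and an 8 nt argument")
    if not set(n123 + n45 + spacer) <= set("ACGT"):
        raise ValueError("only A, C, G, and T are supported")
    left = n123 + "AC" + n45 + "CGTATA"
    return left + spacer + reverse_complement(left)


def has_variant_repeat(site):
    """True if either 13 bp repeat of a 34 bp site fits NNNACNNCGTATA."""
    if len(site) != 34:
        return False
    left, right = site[:13], reverse_complement(site[21:])
    return bool(_VARIANT_REPEAT.fullmatch(left) or _VARIANT_REPEAT.fullmatch(right))


# N1N2N3 = GAT, N4N5 = AA, with an example 8 bp spacer.
site = build_lox_site("GAT", "AA", "TACCTTTC")
print(site, has_variant_repeat(site))
```

Note that a wild type repeat (N4N5 = TT) also fits the general pattern, so
matching the pattern alone does not distinguish variant from wild type sites.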

Also disclosed is a method for producing site-specific recombination of
DNA in cells using the disclosed Cre variants. DNA sequences comprising first
and second lox sites are introduced into cells and contacted with a Cre variant,
thereby producing recombination at the lox sites. As with wild type Cre, the
location and orientation of the lox sites determines the nature of the
recombination.
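
How site orientation determines the outcome on a single DNA molecule can be
illustrated with a toy model. This is a sketch under simplifying assumptions:
the molecule is a list of labels, the markers ">" and "<" stand for lox sites
in the two orientations, and the orientation of the elements inside an inverted
segment is not tracked. Exchange between different molecules is not modeled.

```python
def recombine(tokens):
    """Recombine the two lox sites in a token list.

    Returns (product, excised). Sites in the same orientation give a
    deletion (excised is the removed segment plus one site); sites in
    opposite orientation give an inversion (excised is None).
    """
    sites = [i for i, token in enumerate(tokens) if token in (">", "<")]
    if len(sites) != 2:
        raise ValueError("exactly two sites are expected")
    first, second = sites
    inner = tokens[first + 1:second]
    if tokens[first] == tokens[second]:
        # Same orientation: the flanked segment is excised, one site remains.
        return tokens[:first + 1] + tokens[second + 1:], inner + [tokens[second]]
    # Opposite orientation: the flanked segment is inverted in place.
    return tokens[:first + 1] + inner[::-1] + tokens[second:], None


# A terminator flanked by directly repeated sites is deleted.
print(recombine(["promoter", ">", "terminator", ">", "neo"]))
# A segment flanked by inverted sites is flipped.
print(recombine(["promoter", ">", "b", "a", "<", "tail"]))
```

The first call mirrors the kind of construct in which excision of an
interrupting segment allows a downstream gene to be expressed.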

As used herein, the expression "site-specific recombination" refers to
three different types of recombination events:

1. deletion of a pre-selected DNA segment flanked by recombination
sites,

2. inversion of the nucleotide sequence of a pre-selected DNA segment
flanked by recombination sites, and

3. reciprocal exchange of DNA segments proximate to recombination
sites located on different DNA molecules.

It is to be understood that this reciprocal exchange of DNA segments
can result in an integration event if one or both of the DNA molecules are
circular. "Nucleic acid segment" refers to a linear segment of single- or
double-stranded nucleic acid, which can be derived from any source. The segment
may be a fragment consisting of the segment or a segment within a larger nucleic
acid fragment or molecule. The expression "nucleic acid in eukaryotic cells"
includes all nucleic acid present in eukaryotic cells. The expression "nucleic
acid in yeast" includes all nucleic acid present in yeast cells. "DNA segment"
refers to a linear segment of single- or double-stranded deoxyribonucleic acid
(DNA), which can be derived from any source. The expression "DNA in
eukaryotic cells" includes all DNA present in eukaryotic cells. The expression
"DNA in yeast" includes all DNA present in yeast cells. As used herein, a
"gene" is intended to mean a DNA segment which is normally regarded as a
gene by those skilled in the art. The expression "regulatory molecule" refers to
a polymer of ribonucleic acid (RNA) or a polypeptide which is capable of
enhancing or inhibiting expression of a gene.

"Regulatory nucleotide sequence," as used herein, refers to a nucleotide
sequence located proximate to a gene whose transcription is controlled by the
regulatory nucleotide sequence in conjunction with the gene expression
apparatus of the cell. Generally, the regulatory nucleotide sequence is located 5'
to the gene. The expression "nucleotide sequence" refers to a polymer of DNA
or RNA, which can be single- or double-stranded, optionally containing
synthetic, non-natural, or altered nucleotides capable of incorporation into DNA
or RNA polymers. As used herein, a "regulatory nucleotide sequence" can
include a promoter region, as that term is conventionally employed by those
skilled in the art. A promoter region can include an association region
recognized by an RNA polymerase, one or more regions which control the
effectiveness of transcription initiation in response to physiological conditions,
and a transcription initiation sequence. "Gene product" refers to a polypeptide
resulting from transcription, translation, and, optionally, post-translational
processing of a selected DNA segment.

Materials 

A. Recombinases 

Recombinases suitable for use in the disclosed method include any
enzyme that mediates recombination at specific sites. This includes enzymes
identified as recombinases as well as other enzymes that function to produce
recombination such as integrases and resolvases. As used herein, recombination
at specific sites does not refer only to recombination at completely defined
sequences. Rather, a recombinase is considered to mediate recombination at
specific sites when the sites of recombination are limited in some way by
sequence. For example, wild type Cre recombinase mediates recombination
between sites having the sequence N1N2N3ACTTCGTATANNNNNNNNT
ATACGAAGTN3'N2'N1', which includes both specific and non-specific
sequences. The sequences ACTTCGTATA and TATACGAAGT (an inverted
repeat of the first sequence) are recognized by the Cre recombinase. The
non-specific sequences (positions with "N" in the recognition sequence), although
not limited in sequence, must be compatible with the non-specific sequences of
the partner recombination site in order for recombination to be efficient. The
recombination sites need not have any particular number of specific nucleotides.
All that is required is some constraint on the sequence of the site such that the
recombinase is limited to recombination at some set of sites.

Examples of recombinases that can be used in the disclosed method
include Cre recombinase, FLP recombinase, Beta recombinase of pSM19035
(Diaz et al., J. Biol. Chem. 274:6634-6640 (1999)), Int recombinases
(Nunes-Düby et al., Nucleic Acids Res. 26:391-406 (1998)), and resolvases
(Hallet et al., FEMS Microbiol. Rev. 21:157-178 (1997); Oram et al., Curr. Biol.
5:1106-1109 (1995); Mondragon, Structure 3:755-758 (1995)).

B. Recombination Sites 

Recombination sites are locations within a nucleic acid where
recombination mediated by a recombinase takes place. Recombination sites
generally include specific sequences, referred to as recognition sequences,
through which the recombinase recognizes a given nucleotide sequence as a
recombination site. Different recombinases generally recognize different
recognition sequences. Recombination between two recombination sites
requires (1) that the recombinase recognize the sites as recombination sites, and
(2) that the sequences of the two sites are sufficiently similar. It has been
discovered that recombination between two recombination sites (both of which
are recognized by a recombinase) can be substantially reduced or prevented by
using different compatibility sequences for the recombination sites (the
recognition sequences can also differ if the recombinase can recognize different
sequences). Thus, it is also preferred that the variant recombination sites be
made incompatible with the recombination sites recognized by non-variant
recombinase by using different compatibility sequences for the two sets of
recombination sites. Compatibility sequences in a recombination site are those
sequences in the recombination site, other than the sequences required for
recognition of the site by the recombinase, that must be similar in a pair of
recombination sites for recombination to occur between them. Generally,
recombinases require specific recognition sequences but allow wide variation in
compatibility sequences. Thus, recombination sites that are recognized by a
given recombinase but are incompatible with each other can be readily designed
using the disclosed principles.
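
The two requirements (recognition and compatibility) can be expressed as a
simple predicate. This is a rough sketch with loud assumptions: recognition is
modeled as a match to the wild type Cre pattern with ACTTCGTATA and
TATACGAAGT repeats, and "sufficiently similar" is simplified to an identical
8 bp spacer. Real compatibility is determined functionally and may tolerate
differences that this model rejects.

```python
import re

# Wild type Cre pattern: 3 free nt, fixed repeat, 8 nt spacer, fixed repeat, 3 free nt.
_WILD_TYPE_SITE = re.compile(r"[ACGT]{3}ACTTCGTATA([ACGT]{8})TATACGAAGT[ACGT]{3}")


def spacer_if_recognized(site):
    """Return the 8 bp spacer if the site fits the pattern, else None."""
    match = _WILD_TYPE_SITE.fullmatch(site)
    return match.group(1) if match else None


def predicted_to_recombine(site_a, site_b):
    """Both sites must be recognized and carry the same spacer (assumption)."""
    spacer_a = spacer_if_recognized(site_a)
    spacer_b = spacer_if_recognized(site_b)
    return spacer_a is not None and spacer_a == spacer_b


LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"
# Same repeats, different spacer: recognized, but modeled as incompatible with loxP.
OTHER = "ATAACTTCGTATATACCTTTCTATACGAAGTTAT"

print(predicted_to_recombine(LOXP, LOXP))
print(predicted_to_recombine(LOXP, OTHER))
```

Under this model, two sets of sites can share a recombinase yet be kept from
recombining with each other simply by giving each set its own spacer.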

It should be understood that, for a given recombinase site or for a given
recombinase, whether a given base position in the recombination site is a
recognition sequence base or a compatibility sequence base may depend on
other sequences in the recombination site. For example, a particular base may
function as a compatibility sequence base in a recombination site having one
sequence while the same base may function as a recognition sequence base in a
recombination site having a different sequence. It should also be understood
that recognition sequences and compatibility sequences do not necessarily occur
in blocks within a recombination site. That is, recognition sequence bases and
compatibility sequence bases may be interspersed in a given recombination site.
As discussed below, what is and is not a recognition sequence or a compatibility
sequence in a given recombination site is determined functionally.

The disclosed variant recombination sites and the variant recombinases
that can act on them allow more freedom in the selection of sites of
recombination. In particular, the disclosed variant recombinases can allow
amino acid changes in a protein of interest while retaining the ability to
recombine at a given site.

1. Recognition Sequences

Recognition sequences are regions within a recombination site that must
have a specific sequence, or defined range of sequences, for the cognate
recombinase to recognize the recombination site. Recognition sequences in a
recombination site need not be contiguous. Thus, required nucleotides
dispersed in a recombination site are collectively considered recognition
sequences. Nucleic acid segments can be said to have a defined range of
sequences when every nucleotide position in the nucleic acid segment(s) is
limited to one, two, or three nucleotide bases. That is, so long as a nucleotide
position cannot have one of the possible nucleotide bases, that position has a
defined range of sequence. For example, a nucleotide sequence ATRVBYGC
has a defined range of sequences since each nucleotide position has at least one
limitation. Standard nomenclature for nucleic acid sequences is used herein.
Thus, in this example, R represents A or G; V represents A, C, or G; B
represents C, G, or T; and Y represents C or T.
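
The "defined range of sequences" test above can be illustrated with a short
sketch. It assumes the standard IUPAC one-letter nucleotide codes; the function
names has_defined_range and expand are illustrative only and are not part of
the disclosure.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases each one permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def has_defined_range(seq: str) -> bool:
    """True if every position is limited to one, two, or three bases."""
    return all(len(IUPAC[code]) <= 3 for code in seq.upper())

def expand(seq: str) -> list[str]:
    """List every concrete sequence covered by a degenerate sequence."""
    choices = (IUPAC[code] for code in seq.upper())
    return ["".join(bases) for bases in product(*choices)]

# ATRVBYGC: every position excludes at least one base.
print(has_defined_range("ATRVBYGC"))
# A position coded N (any base) has no limitation, so the test fails.
print(has_defined_range("ATNGC"))
```

Under these assumptions, expand("ATRVBYGC") enumerates the concrete sequences
that fall within the defined range.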

Recognition sequences for recombinases are known or can be
determined through routine analysis. In general, recognition sequences can be
determined by varying the sequence of recombination sites and determining if
recombination between the sites can still occur. For this purpose, the pair of
sites to be recombined should be identical. That is, the same sequence changes
should be made to both sites. This eliminates any incompatibility effect
between the recombination sites. If recombination is eliminated or significantly
reduced when a specific nucleotide is changed, then that nucleotide is required
for recognition of the recombination site by the recombinase.

Examples of dissection of the critical sequences in recombination sites
of recombinases are described by Hoess et al., Nucleic Acids Res. 14:2287-2300
(1986) (involving P1 recombinase); Sauer B., Nucleic Acids Res., 24:4608-4613
(1996) (involving Cre recombinase); Lee and Saito, Gene 216(1):55-65 (1998)
(involving Cre recombinase); and Umlauf and Cox, EMBO J. 7(6):1845-52
(1988) (involving FLP recombinase). Similar techniques can be used to
determine the recognition sequences of other recombinases.

2. Compatibility Sequences
Compatibility sequences are regions in a recombination site that must be
similar in a pair of recombination sites for recombination to occur between
them. In general, the sequence of recombination sites must be similar for
recombination to occur between them. Examples of compatibility sequences
are spacer sequences between repeats in recombination sequences. All or some
of the nucleotides in the recognition sequences for a recombination site may be
involved in compatibility. For example, where some degeneracy of the
recognition sequences is allowed, similar recognition sequences may be
required in a pair of recombination sites for recombination to occur between
them. Thus, compatibility between recombination sites can be affected by using
different sequences in the compatibility sequences other than the sequences
required for recognition of the site by the recombinase (that is, recognition
sequences), compatibility sequences that are part of the recognition sequences,
or both. It is preferred that compatibility between recombination sites be altered
by using different sequences in the compatibility sequences other than the
sequences required for recognition of the site by the recombinase.

Compatibility sequences for many recombinases are known or can be
determined through routine analysis. In general, compatibility sequences can be
easily determined by varying the sequence of recombination sites and
determining if recombination between the sites can still occur. For this purpose,
only one of the sites in the pair of sites to be recombined should be altered.
That is, the same sequence changes should not be made to both sites. This
isolates the incompatibility effect between the recombination sites. Further, only
those nucleotide positions that are not a part of the recognition sequence of the
site should be altered to avoid recognition problems. If recombination is
eliminated or significantly reduced when a specific nucleotide is changed, then
that nucleotide is required for compatibility of the recombination site.
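
The two functional tests described in this section (changing both sites
identically to probe recognition, and changing only one site to probe
compatibility) can be summarized as a small decision rule. The following is a
simplified sketch, not a prescribed procedure: the function name, the boolean
inputs, and the "unconstrained" label are assumptions introduced for
illustration, and the sketch ignores the case, noted above, where a recognition
base also contributes to compatibility.

```python
def classify_position(recombines_when_both_sites_changed: bool,
                      recombines_when_one_site_changed: bool) -> str:
    """Classify one nucleotide position from two hypothetical assay outcomes.

    Each argument records whether recombination still occurred (True) or was
    eliminated or significantly reduced (False) in the corresponding test.
    """
    # Same change in both sites removes any incompatibility effect, so a loss
    # of recombination points to a base required for recognition.
    if not recombines_when_both_sites_changed:
        return "recognition"
    # Change in only one site isolates the incompatibility effect.
    if not recombines_when_one_site_changed:
        return "compatibility"
    # Neither test affected recombination (label assumed for this sketch).
    return "unconstrained"

print(classify_position(False, False))
print(classify_position(True, False))
print(classify_position(True, True))
```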

Examples of dissection of the critical sequences in recombination sites
of recombinases are described by Hoess et al., Nucleic Acids Res. 14:2287-2300
(1986) (involving P1 recombinase); Sauer B., Nucleic Acids Res., 24:4608-4613
(1996) (involving Cre recombinase); Lee and Saito, Gene 216(1):55-65 (1998)
(involving Cre recombinase); and Umlauf and Cox, EMBO J. 7(6):1845-52
(1988) (involving FLP recombinase). Similar techniques can be used to
determine the compatibility sequences of other recombinases.

Recognition and compatibility sequences can be further understood
using Cre recombination sites as an example. Wild type Cre recombinase
mediates recombination between sites having the sequence
N1N2N3ACTTCGTATANNNNNNNNTATACGAAGTN3'N2'N1', which
includes both specific and non-specific sequences (that is, recognition
sequences and compatibility sequences, respectively). The sequences
ACTTCGTATA and TATACGAAGT (an inverted repeat of the first sequence)
are recognized by the Cre recombinase and are the recognition sequences in Cre
recombinase sites. Variant Cre recombinases recognize sites having different
recognition sequences. The non-specific sequences (positions with "N" in the
recognition sequence), although not limited in sequence, must be compatible
with the non-specific sequences of the partner recombination site in order for
recombination to be efficient. Thus, the non-specific sequences are the
compatibility sequences of a recombinase site.
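
As an illustration of the Cre example, the sketch below separates the
recognition sequences from the non-specific positions of a 34-base site. It
assumes a layout of three non-specific bases, ACTTCGTATA, eight non-specific
bases, TATACGAAGT, and three non-specific bases, and it simplifies "compatible"
to "identical at every non-specific position", which is stricter than the
similarity the text requires. The function names and the altered spacer are
hypothetical.

```python
import re

# Assumed layout: 3 N + ACTTCGTATA + 8 N + TATACGAAGT + 3 N (34 bases).
CRE_SITE = re.compile(
    r"^([ACGT]{3})ACTTCGTATA([ACGT]{8})TATACGAAGT([ACGT]{3})$"
)

def nonspecific_positions(site: str):
    """Return the non-specific segments, or None if recognition fails."""
    match = CRE_SITE.match(site.upper())
    return match.groups() if match else None

def is_recognized(site: str) -> bool:
    """True if both wild type recognition sequences are present in place."""
    return nonspecific_positions(site) is not None

def are_compatible(site_a: str, site_b: str) -> bool:
    """Simplified rule: both recognized and identical non-specific bases."""
    a, b = nonspecific_positions(site_a), nonspecific_positions(site_b)
    return a is not None and a == b

LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"          # wild type loxP
ALTERED_SPACER = "ATAACTTCGTATAGTATACATTATACGAAGTTAT"  # hypothetical spacer change

print(is_recognized(ALTERED_SPACER))         # still recognized
print(are_compatible(LOXP, ALTERED_SPACER))  # spacers differ
```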


C. Recombination Constructs 

Recombination constructs are designed to provide an observable change
when recombination between recombination sites occurs. Preferred
recombination constructs include two pairs of recombination sites, one pair
having a variant sequence and another pair having a sequence recognized by
non-mutant recombinase (for example, wild type recombinase). Sites in the
first pair are referred to as variant recombination sites. Generally,
recombination constructs include a first nucleic acid sequence that includes a
first reporter gene and first and second recombination sites, where the first and
second recombination sites are variant recombination sites, and a second nucleic
acid sequence that includes a second reporter gene and third and fourth
recombination sites, where the third and fourth recombination sites can be
recombined by a non-mutant recombinase. The first and second nucleic acid
sequences need not be present on the same vector or on the same nucleic acid
molecule (for example, the chromosome), although this is preferred. It is
preferred that recombination constructs be embodied in vectors, such as
plasmids.

In one embodiment of the disclosed recombination constructs, the
sequences of the recombination sites in the constructs are chosen such that the
recognition sequences of the first and second recombination sites differ from the
recognition sequences of the third and fourth recombination sites. The sequence
of the recombination sites can also be chosen such that the compatibility
sequences of the first and second recombination sites differ from the
compatibility sequences of the third and fourth recombination sites such that the
first and second recombination sites cannot recombine with the third and fourth
recombination sites. The sequence of the recombination sites can also be
chosen such that the compatibility sequences of the first and second
recombination sites are sufficiently similar to allow recombination between the
first and second recombination sites, and such that the compatibility sequences
of the third and fourth recombination sites are sufficiently similar to allow
recombination between the third and fourth recombination sites. The above
sequence relationships result in constructs where the first and second
recombination sites can recombine (in the presence of a recombinase that
recognizes the sites), the third and fourth recombination sites can recombine,
but where neither the first nor second recombination site can recombine with
either the third or fourth recombination site (since differences in the
compatibility sequences prevent recombination).

Arriving at recombination sites having relationships as described above
is preferably accomplished in the following way. Starting with a given
recombination site sequence (which can be recombined by a non-mutant
recombinase), parallel changes are made in the compatibility sequences of the
first and second recombination sites. These altered recombination sites should
then be tested to make sure that the non-mutant recombinase can still mediate
their recombination. This helps ensure that compatibility sequence changes
have not inadvertently affected the function of the recombination sites. Once
this is confirmed, changes can be made to the recognition sequences of the first
and second recombination sites. These changes result in variant recombination
sites for which variant recombinases can be identified using the method
disclosed herein. The resulting recombination sites have the desired properties:
incompatibility with the third and fourth recombination sites and variant
recognition sequences that extend the range of recombination-competent sites.
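
The design sequence just described can be sketched as a small data model. This
is an illustration under stated assumptions only: the Site class, the can_pair
rule (identical compatibility sequences), and the altered sequences are
hypothetical, and the bench test with non-mutant recombinase appears only as a
comment because it cannot be computed.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Site:
    recognition: str    # bases the recombinase must recognize
    compatibility: str  # bases that must match the partner site

def can_pair(a: Site, b: Site) -> bool:
    """Simplified rule: partner sites need identical compatibility sequences."""
    return a.compatibility == b.compatibility

def design_sites(start: Site, new_compatibility: str, new_recognition: str):
    """Return (first, second, third, fourth) sites from a starting site."""
    third = fourth = start
    # Step 1: parallel compatibility changes for the first and second sites.
    intermediate = replace(start, compatibility=new_compatibility)
    # Step 2 (laboratory step): confirm non-mutant recombinase still
    # recombines a pair of `intermediate` sites before continuing.
    # Step 3: change the recognition sequences of the first and second sites.
    first = second = replace(intermediate, recognition=new_recognition)
    return first, second, third, fourth

wild_type = Site(recognition="ACTTCGTATA", compatibility="GCATACAT")
first, second, third, fourth = design_sites(
    wild_type,
    new_compatibility="GTATACAT",   # hypothetical altered spacer
    new_recognition="ACTTCGTGTA",   # hypothetical variant recognition sequence
)
print(can_pair(first, second), can_pair(third, fourth), can_pair(first, third))
```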

The recombination sites can have a variety of properties and
relationships that make them useful for particular purposes. For example, the
recombination sites can be designed such that the first and second
recombination sites cannot be recombined by non-mutant recombinase to a
significant extent. This allows separate assessment of cleavage by mutant and
non-mutant recombinase. It is also useful if the first and second recombination
sites have identical sequences, and the third and fourth recombination sites have
identical sequences.

Recombination between the recombination sites can have a variety of
effects that allow detection of recombination. For example, the constructs can
be designed such that recombination between the first and second
recombination sites alters the expression of the first reporter gene, where
recombination between the first and second recombination sites is determined
by determining if expression of the first reporter gene is altered; recombination
between the third and fourth recombination sites alters the expression of the
second reporter gene, where recombination between the third and fourth
recombination sites is determined by determining if expression of the second
reporter gene is altered; recombination between the first and second
recombination sites allows the first reporter gene to be expressed; the first
nucleic acid sequence includes a spacer sequence flanked by the first and
second recombination sites, where the spacer sequence interrupts the first
reporter gene such that the first reporter gene is not expressed, and where
recombination of the first and second recombination sites excises the spacer
sequence which allows the first reporter gene to be expressed; and/or a portion
of the first reporter gene is inverted, wherein the inverted portion of the first
reporter gene is flanked by the first and second recombination sites, wherein
recombination of the first and second recombination sites inverts the inverted
portion of the first reporter gene which allows the first reporter gene to be
expressed.

The constructs can also be designed such that recombination between
the first and second recombination sites prevents expression of the first reporter
gene; the first reporter gene is flanked by the first and second recombination
sites, where recombination of the first and second recombination sites excises
the first reporter gene which prevents expression of the first reporter gene; a
portion of the first reporter gene is flanked by the first and second
recombination sites, where recombination of the first and second recombination
sites inverts the flanked portion of the first reporter gene which prevents
expression of the first reporter gene; recombination between the third and fourth
recombination sites allows the second reporter gene to be expressed; and/or the
second nucleic acid sequence includes a spacer sequence flanked by the third
and fourth recombination sites, where the spacer sequence interrupts the second
reporter gene such that the second reporter gene is not expressed, and where
recombination of the third and fourth recombination sites excises the spacer
sequence which allows the second reporter gene to be expressed.

The constructs can also be designed such that a portion of the second
reporter gene is inverted, where the inverted portion of the second reporter gene
is flanked by the third and fourth recombination sites, and where recombination
of the third and fourth recombination sites inverts the inverted portion of the
second reporter gene which allows the second reporter gene to be expressed;
recombination between the third and fourth recombination sites prevents
expression of the second reporter gene; the second reporter gene
is flanked by the third and fourth recombination sites, where recombination of
the third and fourth recombination sites excises the second reporter gene which
prevents expression of the second reporter gene; and/or a portion of the second
reporter gene is flanked by the third and fourth recombination sites, where
recombination of the third and fourth recombination sites inverts the flanked
portion of the second reporter gene which prevents expression of the second
reporter gene.
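
The reporter layouts listed above reduce to a simple before/after table, which
the following sketch makes explicit. The layout labels and function names are
shorthand introduced here for illustration, and the model treats each
recombination event as a single one-way switch, which is a simplification.

```python
# (expressed before recombination, expressed after recombination).
# Keys are illustrative labels for the layouts described in the text.
LAYOUTS = {
    "spacer_interrupts_gene": (False, True),    # excision removes the spacer
    "portion_inverted": (False, True),          # inversion restores the gene
    "gene_flanked_by_sites": (True, False),     # excision removes the gene
    "portion_flanked_by_sites": (True, False),  # inversion disrupts the gene
}

def is_expressed(layout: str, recombined: bool) -> bool:
    """Expected reporter state for a layout, before or after recombination."""
    before, after = LAYOUTS[layout]
    return after if recombined else before

def recombination_inferred(layout: str, observed_expression: bool) -> bool:
    """Recombination is inferred when expression differs from the start state."""
    return observed_expression != LAYOUTS[layout][0]

for name in LAYOUTS:
    print(name, is_expressed(name, False), is_expressed(name, True))
```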

Expression of a reporter gene can include transcription of the gene,
translation of the transcript (if the gene encodes a protein), and/or production of
an active protein. As used herein, whether a reporter gene is expressed depends
on the context. In general, a gene is considered to be expressed if it produces
the expression product to be detected. Such expression products include full or
partial transcripts of the gene, full or partial proteins, including active or
inactive forms of the proteins, translated from the transcript. Since the goal in
using reporter genes in the disclosed method is the detection of expression, any
of these forms of expression product can be the object of detection. For
example, if the gene's transcript is to be detected, the gene will be considered to
be expressed if it produces the transcript, regardless of whether the transcript is
translated or whether the resulting protein is active. If an active protein
encoded by the gene is to be detected, the gene is not expressed unless active
protein is produced; mere transcription of the gene, or even translation to
produce an inactive protein, will not be enough in this context. As a
consequence, the expression product to be detected will influence the manner in
which reporter genes should be interrupted or inverted in the disclosed
constructs. For example, nearly any interruption of a reporter gene would
prevent expression of an active protein encoded by the gene. On the other hand,
an interruption of the coding region will usually not prevent production of a
transcript. The structure of the disclosed constructs should be designed with
these principles in mind. As used herein, an inactive expression product refers
to an expression product that does not have an activity exhibited by the active
form of the expression product where that activity is required for detection of
expression in the assay scheme being used.

The constructs can be designed such that the first nucleic acid sequence
is on a first nucleic acid construct and the second nucleic acid sequence is on a
second nucleic acid construct; the first nucleic acid construct is an
extrachromosomal vector and the second nucleic acid construct is in the genome
of a host cell; and/or the first and second nucleic acid constructs are on the same
nucleic acid construct.

D. Reporter Genes 

Reporter genes are used to monitor whether recombination occurs in the
disclosed constructs. Reporter genes can be any gene the expression of which
can be detected either directly or indirectly. These include genes encoding
enzymes, such as β-galactosidase, luciferase, and alkaline phosphatase, that can
produce specific detectable products, and genes encoding proteins that can be
directly detected. Virtually any protein can be directly detected by using, for
example, specific antibodies to the protein. A preferred reporter protein that can
be directly detected is the green fluorescent protein (GFP). GFP, from the
jellyfish Aequorea victoria, produces fluorescence upon exposure to ultraviolet
light without the addition of a substrate (Chalfie et al., Science 263:802-5
(1994)). A number of modified GFPs have been created that generate as much
as 50-fold greater fluorescence than does wild type GFP under standard
conditions (Cormack et al., Gene 173:33-8 (1996); Zolotukhin et al., J. Virol.
70:4646-54 (1996)). This level of fluorescence allows the detection of low
levels of expression in cells.

Reporter genes encoding proteins producing a fluorescent signal are
useful since such a signal allows cells to be sorted using FACS. Another way of
sorting cells based on expression of the reporter gene involves using the reporter
protein as a hook to bind cells. For example, a cell surface protein such as a
receptor protein can be bound by a specific antibody. Cells expressing such a
protein can be captured by, for example, using antibodies bound to a solid
substrate, using antibodies bound to magnetic beads, or capturing antibodies
bound to the reporter protein. Many techniques for the use of antibodies as
capture agents are known and can be used with the disclosed method.

The reporter gene can also encode an expression product that regulates
the expression of another gene. This allows detection of expression of the
reporter gene by detecting expression of the regulated gene. For example, a
repressor protein can be encoded by the reporter gene. Loss of expression of
the reporter gene (via recombination) would then result in derepression of the
regulated gene. This type of indirect detection allows positive detection of loss
of the expression of the reporter gene by the affector RNA molecule. One
preferred form of this type of regulation is the use of an antibiotic resistance
gene regulated by a repressor protein encoded by the reporter gene. By
exposing the host cells to the antibiotic, only those cells in which expression of
the reporter gene has been inhibited will grow since expression of the antibiotic
resistance gene will be derepressed.

E. Expression Sequences

The reporter genes can be expressed using any suitable expression
sequences. Numerous expression sequences are known and can be used for
expression of the reporter genes. Expression sequences can generally be
classified as promoters, terminators, and, for use in eukaryotic cells, enhancers.
Expression in prokaryotic cells also requires a Shine-Dalgarno sequence just
upstream of the coding region for proper translation initiation. Inducible
promoters are preferred for use with the first reporter gene since it is preferred
that expression of the first reporter gene be adjustable.

Promoters suitable for use with prokaryotic hosts illustratively include
the β-lactamase and lactose promoter systems, tetracycline (tet) promoter,
alkaline phosphatase promoter, the tryptophan (trp) promoter system and hybrid
promoters such as the tac promoter. However, many other functional bacterial
promoters are suitable. Their nucleotide sequences are generally known.

Suitable promoting sequences for use with yeast hosts include the
promoters for 3-phosphoglycerate kinase, enolase, glyceraldehyde-3-phosphate
dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase,
triosephosphate isomerase, phosphoglucose isomerase, and glucokinase.
Examples of inducible yeast promoters suitable for use in the disclosed vectors
include the promoter regions for alcohol dehydrogenase 2, isocytochrome C,
acid phosphatase, degradative enzymes associated with nitrogen metabolism,
metallothionein, glyceraldehyde-3-phosphate dehydrogenase, and enzymes
responsible for maltose and galactose utilization. Yeast enhancers also are
advantageously used with yeast promoters.

Preferred promoters for use in mammalian host cells include promoters
from polyoma virus, Simian Virus 40 (SV40), adenovirus, retroviruses,
hepatitis B virus, herpes simplex virus (HSV), Rous sarcoma virus (RSV),
mouse mammary tumor virus (MMTV), and most preferably cytomegalovirus
(CMV), or from heterologous mammalian promoters such as the β-actin
promoter. Particularly preferred are the early and late promoters of the SV40
virus and the immediate early promoter of the human cytomegalovirus, MMTV
LTR, RSV-LTR, and the HSV thymidine kinase promoter.

Transcription of the reporter gene by higher eukaryotes can be
increased by inserting an enhancer sequence into the vector. Many enhancer
sequences are now known from mammalian genes (globin, elastase, albumin,
and insulin). Typically, however, one will use an enhancer from a eukaryotic
cell virus. Examples include the SV40 enhancer on the late side of the
replication origin, the cytomegalovirus early promoter enhancer, the polyoma
enhancer on the late side of the replication origin, and adenovirus enhancers.

The disclosed vectors preferably also contain sequences necessary for
accurate 3' end formation of both reporter and affector RNAs. In eukaryotic
cells, this would be a polyadenylation signal. In prokaryotic cells, this would be
a transcription terminator.

Method 

A. Identification of Variant Recombinases 

The disclosed method involves producing mutant recombinases and
testing the mutant recombinases with specially designed constructs. The
constructs contain variant recombination sites that are not recognized by
non-mutant recombinase but will undergo recombination in the presence of a
mutant recombinase with altered specificity. The disclosed method also involves
determining whether a variant recombinase retains the ability to mediate
recombination at recombination sites recognized by non-variant recombinase.
This can be accomplished by using constructs containing recombination sites
recognized by non-variant recombinase. Recombination at these recombination
sites can be monitored or detected by any suitable means. It is preferred that
recombination is detected by screening or selection based on the expression or
lack of expression of a reporter gene. This can be accomplished by using
constructs containing a reporter gene associated with the recombination sites
such that the reporter gene is rearranged or deleted, or a spacer sequence
interrupting the reporter gene is rearranged or deleted, as a result of
recombination at the recombination sites. Recombination of such constructs
will result in a loss of expression of the reporter gene, where the construct
contained a functional reporter gene, or in a gain in expression of the reporter
gene, where the construct contained a non-functional reporter gene.
1. Production of Mutant Recombinases 

Mutant recombinases can be produced by any suitable technique. In
general, all that is required is a method of generating a variety of recombinase
proteins having a variety of amino acid sequences. The most preferred way of
doing this is to mutagenize or alter nucleic acid encoding the recombinase and
then express the mutant recombinases. Numerous techniques for introducing
alterations into nucleic acid sequences are known and can be used in the
disclosed method. For example, alterations can be made by chemical
mutagenesis, introduction of degenerate nucleic acid fragments into the base
nucleic acid molecule, and low fidelity PCR. The goal of this mutagenesis or
alteration will be the generation of a population or set of mutant recombinases
having a variety of sequences. The broader the range of variants, the more raw
material for the identification process.
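
The goal of generating a varied population can be pictured with a toy in silico
analogue of random mutagenesis. This sketch is only a conceptual aid and is not
a model of any laboratory protocol: the per-base substitution rate, the
function names, and the placeholder sequence are all assumptions.

```python
import random

BASES = "ACGT"

def mutagenize(seq: str, rate: float, rng: random.Random) -> str:
    """Replace each base with a different base with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def make_library(seq: str, size: int, rate: float, seed: int = 0) -> list[str]:
    """Generate `size` variants of `seq`; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [mutagenize(seq, rate, rng) for _ in range(size)]

template = "ATGGCTAGCAAAGGA"  # arbitrary placeholder, not a recombinase gene
library = make_library(template, size=100, rate=0.05)
print(len(set(library)), "distinct sequences in the library")
```

A higher assumed rate broadens the range of variants at the cost of more
sequences carrying multiple changes.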

2. Identification of Variant Recombinases That Recognize Variant
Recombination Sites

Variant recombinases that can mediate recombination at variant
recombination sites are identified in the disclosed method by selecting for,
screening for, or otherwise detecting recombination of specially designed
constructs having variant recombination sites. Recombination at variant
recombination sites can be monitored or detected by any suitable means. It is
preferred that recombination is detected by screening or selection based on the
expression or lack of expression of a reporter gene. This can be accomplished
by using constructs containing a reporter gene associated with the variant
recombination sites such that the reporter gene is rearranged or deleted, or a
spacer sequence interrupting the reporter gene is rearranged or deleted, as a
result of recombination at the recombination sites. Recombination of such
constructs will result in a loss of expression of the reporter gene, where the
construct contained a functional reporter gene, or in a gain in expression of the
reporter gene, where the construct contained a non-functional reporter gene.

3. Identification of Variant Recombinases That Recognize Non-Variant
Recombination Sites

Variant recombinases that can mediate recombination at recombination
sites recognized by non-variant recombinase (non-variant recombination sites)
are identified in the disclosed method by selecting for, screening for, or
otherwise detecting recombination of specially designed constructs having
recombination sites recognized by non-variant recombinase. Recombination at
these recombination sites can be monitored or detected by any suitable means.
It is preferred that recombination is detected by screening or selection based on
the expression or lack of expression of a reporter gene. This can be
accomplished by using constructs containing a reporter gene associated with the
recombination sites recognized by non-variant recombinase such that the
reporter gene is rearranged or deleted, or a spacer sequence interrupting the
reporter gene is rearranged or deleted, as a result of recombination at the
recombination sites. Recombination of such constructs will result in a loss of
expression of the reporter gene, where the construct contained a functional
reporter gene, or in a gain in expression of the reporter gene, where the
construct contained a non-functional reporter gene.

It is preferred that the ability of a variant recombinase to mediate
recombination at both variant recombination sites and recombination sites
recognized by non-variant recombinase be assessed in the same system (such as
a cell strain) either sequentially or simultaneously. When variant recombinases
are tested for activity on both variant recombination sites and recombination
sites recognized by non-variant recombinase in the same system or at the same
time, it is preferred that two different reporter genes which can be separately
detected or monitored be used. In this case, a first reporter gene can be
associated with the variant recombination sites and a second reporter gene can
be associated with recombination sites recognized by non-variant recombinase.
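
With two separately detectable reporters, each candidate recombinase yields a
pair of readouts that can be sorted into outcomes. The sketch below is one
hedged way to tabulate them; the category labels and the function name are
shorthand introduced here, and how a given readout maps to "recombination
occurred" depends on the reporter layout chosen.

```python
def classify_variant(recombines_variant_sites: bool,
                     recombines_non_variant_sites: bool) -> str:
    """Sort a candidate recombinase by its activity on the two site pairs.

    The first argument comes from the first reporter (variant sites), the
    second from the second reporter (sites recognized by non-variant
    recombinase).
    """
    if recombines_variant_sites and recombines_non_variant_sites:
        return "broadened specificity"
    if recombines_variant_sites:
        return "altered specificity"
    if recombines_non_variant_sites:
        return "unchanged specificity"
    return "inactive"

print(classify_variant(True, True))
print(classify_variant(True, False))
```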

Use of Variant Recombinases

Variant recombinases produced in the disclosed method can be used for
any purpose for which unmodified recombinases can be used. The advantage is
that the variant recombinases have a different or broader site specificity. In
general, the disclosed variant recombinases can be used to mediate
recombination of any nucleic acid in any setting, including in vitro, in cell
culture, and in vivo. Recombination can be obtained in single celled organisms,
such as bacterial cells, fungal cells, yeast cells, prokaryotic cells, and
archaebacterial cells, and in the cells of multicellular organisms, including
plants and animals, both in the organism and in culture. The disclosed variant
recombinases can also be used in combination with other recombinases
(including other variant recombinases) having a different site specificity. Such
combinations allow more complex recombination schemes to be used. Examples
of such schemes are discussed below.

For some uses of the disclosed recombinases, first, second, and fourth
DNA sequences comprising a first recombination site, a second recombination
site, and a third recombination site, respectively, are introduced into cells. As
used herein the expression "recombination site" means a nucleotide sequence at
which a recombinase or variant recombinase can catalyze a site-specific
recombination.

Methods for introducing a DNA sequence into cells are known in the art.
These methods typically include the use of a DNA vector to introduce the
sequence into the DNA of a single or limited number of eukaryotic cells and
then growing such cell or cells to generate a suitable population of cells. As
used herein, the term "vector" includes plasmids, viruses, and viral vectors.
Preferably, the DNA sequences are introduced by a plasmid capable of
transforming a selected cell while carrying a DNA sequence. The particular
vector which is employed to introduce the DNA sequence into a selected cell is
not critical.

In the present method, the recombination sites are contacted with a
variant recombinase, thereby producing the site specific recombination. A
preferred means of contacting the DNA to be recombined with a variant
recombinase is to place the DNA to be recombined into a cell expressing
nucleic acid encoding the variant recombinase. Preferably, expression of the
variant recombinase is under the control of a regulatory nucleotide sequence.
Suitable regulatory nucleotide sequences are known in the art. The regulatory
nucleotide sequence which is employed with a selected eukaryotic cell is not
critical to the method. A partial list of suitable regulatory nucleotide sequences
includes the long terminal repeat of Moloney sarcoma virus described by
Blochlinger and Diggelmann, Mol. Cell. Biol., 4:2929-2931 (1984); the mouse
metallothionein-I promoter described by Pavlakis and Hamer, Proc. Natl. Acad.
Sci. USA, 80:397-401 (1983); the long terminal repeat of Rous sarcoma virus
described by Gorman et al., Proc. Natl. Acad. Sci. USA, 79:6777-6781 (1982);
and the early region promoter of SV40 described by Southern and Berg, J. Mol.
Appl. Genet., 1:327-341 (1982).

In an embodiment where the cells are yeast, suitable regulatory
nucleotide sequences include GAL1, GAL10, ADH1, CYC1, and TRP5
promoters. GAL1 and GAL10 promoters are present on plasmid pBM150 which
is described by Johnston and Davis, Molec. Cell. Biol., 4:1440 (1984). The
ADH1 promoter, also called ADC1, is present on plasmid pAAH5 which is
described by Ammerer, Methods Enzymol., 101:192 (1983). The CYC1 promoter
is described by Stiles et al., Cell, 25:277 (1981). The TRP5 promoter is
described by Zalkin and Yanofsky, J. Biol. Chem., 257:1491 (1982).
Preferably, the regulatory nucleotide sequence is a GAL1 promoter.

In one embodiment where the cell is yeast, the first, second, and
optionally, third and fourth DNA sequences are introduced into one strain of
yeast. Alternatively, the DNA sequences are introduced into two different
strains of yeast of opposite mating types which are subsequently mated to form
a single strain having all three or four DNA sequences. Preferably, the plasmid
contains either (1) a nucleotide sequence of DNA homologous to a resident
yeast sequence to permit integration into the yeast DNA by the yeast's
recombination system or (2) a nucleotide sequence of DNA which permits
autonomous replication in yeast. One nucleotide sequence which permits
autonomous replication in yeast is an ARS sequence described by Stinchcomb
et al., Nature, 282:39 (1979). A partial list of plasmids capable of transforming
yeast includes YIP5, YRP17 and YEP24. These plasmids are disclosed and
described by Botstein and Davis, The Molecular Biology of the Yeast
Saccharomyces, Metabolism and Gene Expression (ed. Strathern et al.), (Cold
Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1982), at page 607.

Since most recombination sites are asymmetrical nucleotide sequences,
two recombination sites on the same DNA molecule can have the same or
opposite orientations with respect to each other. Recombination between
recombination sites in the same orientation results in a deletion of the DNA
segment located between the two recombination sites and a connection between
the resulting ends of the original DNA molecule. The deleted DNA segment
forms a circular molecule of DNA. The original DNA molecule and the
resulting circular molecule each contain a single recombination site (see Figure
2). Recombination between recombination sites in opposite orientations on the
same DNA molecule results in an inversion of the nucleotide sequence of the
DNA segment located between the two recombination sites (see Figure 2). In
addition, reciprocal exchange of DNA segments proximate to recombination
sites located on two different DNA molecules can occur. All of these
recombination events are catalyzed by recombinases, including the disclosed
variants and wild type recombinases.
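
By way of illustration only, the orientation rules described above can be sketched in Python. The sketch forms no part of the disclosed method. It assumes the 34 base pair LoxP sequence given elsewhere in this document as the recombination site, models recognition as an exact string match on one strand, and uses hypothetical function names (revcomp, find_sites, recombine) and toy flanking sequences chosen for the example.

```python
# Illustrative model only: a site is recognized by exact sequence match,
# and only the first two sites on a linear molecule are recombined.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(dna, site=LOXP):
    """Return (start, orientation) per site; +1 is forward, -1 is inverted."""
    hits = []
    inverted_site = revcomp(site)
    for i in range(len(dna) - len(site) + 1):
        window = dna[i:i + len(site)]
        if window == site:
            hits.append((i, +1))
        elif window == inverted_site:
            hits.append((i, -1))
    return hits


def recombine(dna, site=LOXP):
    """Recombine the first two sites found on one linear DNA molecule."""
    hits = find_sites(dna, site)
    if len(hits) < 2:
        return {"event": "none", "product": dna}
    (a, orient_a), (b, orient_b) = hits[0], hits[1]
    n = len(site)
    if orient_a == orient_b:
        # Same orientation: the flanked segment is deleted as a circle.
        # The linear product and the circle each retain a single site.
        return {
            "event": "deletion",
            "product": dna[:a + n] + dna[b + n:],
            "circle": dna[a + n:b + n],  # circle written as a linear string
        }
    # Opposite orientation: the flanked segment is inverted in place.
    return {
        "event": "inversion",
        "product": dna[:a + n] + revcomp(dna[a + n:b]) + dna[b:],
    }
```

For example, recombine("GGGG" + LOXP + "TTTTCCCC" + LOXP + "AAAA") reports a deletion, whereas replacing the second site with revcomp(LOXP) reports an inversion of the flanked segment.
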

Recombination using the disclosed variant recombinases can be used in
vitro to produce site-specific recombination of nucleic acid molecules. This is
useful for a wide variety of manipulations that currently employ wild type
recombinases or involve traditional restriction enzyme cleavage followed by
ligation. Examples include recombination of libraries of DNA fragments into
vectors or in desired structures, and labeling of DNA via recombination.
Recombined DNA formed by in vitro recombination can then be introduced into
cells. For example, constructs formed in vitro can be introduced into cells to
resolve the structures formed in vitro or to select active structures. In particular,
large concatemers of subject DNA and spacer/vector fragments can be made,
introduced into cells, and circularized into vector units in the cells. Such
recombination could also be performed in vitro if desired.

The disclosed variant recombinases can be used to label DNA by
recombining a DNA molecule of interest with a labeled DNA molecule. Use of
a recombinase for labeling is advantageous since it involves fewer steps than
traditional labeling via DNA synthesis or ligation. These considerations are
particularly important when large DNA molecules (over 20 kb) are to be labeled,
since such large molecules fragment more the more they are manipulated.

Recombination mediated by the disclosed variant recombinases and
variant recombination sites can be used to manipulate a host cell genome as
desired and simultaneously introduce a marker gene flanked by the recognition
sites of a second recombinase. After selection, leading to an accumulation of
cells carrying the desired genomic alteration, one could simply remove the
marker gene by expression of the second site-specific recombinase. A large
number of recombinases suitable for this purpose exists in nature, including λ
Integrase (Int), yeast Flp, etc. (Nunes-Düby et al., Nucl. Acids Res., 26:391-406
(1998)). Variant recombinases having different site specificity can also be used.

Since the disclosed variant recombinases recognize both wild type
recombination sites and variant recombination sites that are not recognized by
wild type recombinase, wild type recombinase and variant recombinases can be
used to mediate sequential recombination between nucleic acids containing a
combination of wild type recombination sites and variant recombination sites.
For example, generation of knockout animals and plants can be made more
efficient by using a structure wild type recombination site-selectable marker-
wild type recombination site-endogenous gene-variant recombination site
(rather than the conventional wild type recombination site-selectable marker-
wild type recombination site-endogenous gene-wild type recombination site).
Such a structure allows the selectable marker to be removed by the action of
wild type recombinase without disturbing the gene since wild type recombinase
will not recognize the variant recombination site to any significant degree. The
endogenous gene can then be removed later by the action of a variant
recombinase since the disclosed variant recombinases recognize both wild type
and variant recombination sites.

In a preferred embodiment of the disclosed method, the first and second
DNA sequences are introduced into cells connected by a pre-selected DNA
segment. The segment can be a gene or any other sequence of
deoxyribonucleotides of homologous, heterologous or synthetic origin.
Preferably, the pre-selected DNA segment is a gene for a structural protein, an
enzyme, or a regulatory molecule. If the first and second recombination sites
have the same orientation, activation of the regulatory nucleotide sequence
produces a deletion of the pre-selected DNA segment. If the first and second
recombination sites have opposite orientation, activation of the regulatory
nucleotide sequence produces an inversion of the nucleotide sequence of the
pre-selected DNA segment.

If a fourth DNA sequence (containing the third recombination site) is
also introduced into cells, it is preferred that the second and fourth DNA
sequences be introduced into cells connected by a second pre-selected DNA
segment. The second segment can be a gene or any other sequence of
deoxyribonucleotides of homologous, heterologous or synthetic origin.
Preferably, the second pre-selected DNA segment is a gene for a structural
protein, an enzyme, or a regulatory molecule. If the second and third
recombination sites have the same orientation, activation of the regulatory
nucleotide sequence produces a deletion of the second pre-selected DNA
segment. If the second and third recombination sites have opposite orientation,
activation of the regulatory nucleotide sequence produces an inversion of the
nucleotide sequence of the second pre-selected DNA segment.

Combinations of wild type and variant recombination sites, and
combinations of different orientations of the recombination sites, in DNA
introduced into cells can multiply recombination options. For example, if the
first and second recombination sites are wild type recombination sites and the
third recombination site is a variant recombination site (all in the same
orientation) then wild type recombinase can produce a deletion of the first
pre-selected DNA segment (but not the second) and a variant recombinase can
produce a deletion of the first, second, or both pre-selected DNA segments.
This arrangement allows sequential deletion of the first and second pre-selected
DNA segments.

If the first and second recombination sites are wild type recombination
sites and the third recombination site is a variant recombination site, and the
first recombination site has the opposite orientation from the second and third
recombination sites (which, of course, have the same orientation) then wild type
recombinase can produce an inversion of the first pre-selected DNA segment
and a variant recombinase can produce a deletion of the second pre-selected
DNA segment (and/or produce an inversion of the first pre-selected DNA
segment or the entire section spanning the first, second, and third recombination
sites).

If the first and third recombination sites are wild type recombination
sites and the second recombination site is a variant recombination site, and the
second recombination site has the opposite orientation from the first and third
recombination sites (which, of course, have the same orientation) then wild type
recombinase can produce a deletion of the entire section spanning the first,
second, and third recombination sites, and a Cre variant can produce an
inversion of the first, second, or both pre-selected DNA segments.

If the first and third recombination sites are wild type recombination
sites and the second recombination site is a variant recombination site, and the
first recombination site has the opposite orientation from the second and third
recombination sites (which, of course, have the same orientation) then wild type
recombinase can produce an inversion of the entire section spanning the first,
second, and third recombination sites, and a variant recombinase can produce a
deletion of the second pre-selected DNA segment and an inversion of the first
pre-selected DNA segment.

Many more combinations of wild type and variant recombination sites
and/or recombination site orientations are possible. For example, the variant
recombinase can also be used with a different variant recombinase having a
different site specificity rather than wild type recombinase. The above
examples illustrate the general principles involved in designing specific
recombinations that may be desired. It should be understood that the above
combinations of recombination sites can be extended to the use of more
recombination sites (that is, more than three) and more intervening, pre-selected
DNA segments.
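
The design principle illustrated by the above combinations can be sketched as a short enumeration in Python. This is an illustrative sketch only and not part of the disclosed method. It assumes an all-or-none model in which wild type recombinase acts only on wild type sites while a variant recombinase acts on both wild type and variant sites. The names Site, recognizes and outcomes are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Site:
    kind: str         # "wt" or "variant"
    orientation: int  # +1 or -1


def recognizes(recombinase, site):
    """Assumed all-or-none rule: wild type enzyme acts only on wild type sites;
    a variant enzyme acts on both wild type and variant sites."""
    return recombinase == "variant" or site.kind == "wt"


def outcomes(sites, recombinase):
    """List (event, first_segment, last_segment) for every usable pair of sites.

    Sites are numbered 1, 2, 3, ... along the DNA; pre-selected segment k lies
    between site k and site k + 1.
    """
    events = []
    for i, j in combinations(range(len(sites)), 2):
        a, b = sites[i], sites[j]
        if recognizes(recombinase, a) and recognizes(recombinase, b):
            same = a.orientation == b.orientation
            events.append(("deletion" if same else "inversion", i + 1, j))
    return events


# First example above: sites 1 and 2 wild type, site 3 variant, same orientation.
example = [Site("wt", +1), Site("wt", +1), Site("variant", +1)]
wild_type_events = outcomes(example, "wt")
variant_events = outcomes(example, "variant")
```

Under these assumptions, wild_type_events contains only a deletion of the first segment, while variant_events contains deletions of the first segment, the second segment, and both together. Adding more Site entries to the list extends the enumeration to more than three recombination sites.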

For some uses of the disclosed recombinases, first, second, and fourth
DNA sequences comprising a first lox site, a second lox site, and a third lox
site, respectively, are introduced into cells. As used herein the expression "lox
site" means a nucleotide sequence at which the gene product of the cre gene,
referred to herein as Cre, and/or the disclosed Cre variants, can catalyze a
site-specific recombination. The LoxP site is a 34 base pair nucleotide sequence
(Figure 1) which can be directly synthesized or isolated from bacteriophage P1 by
methods known in the art. The LoxP site is an example of a wild type lox site.
One method for isolating a LoxP site from bacteriophage P1 is disclosed by
Hoess et al., Proc. Natl. Acad. Sci. USA, 79:3398-3402 (1982). The LoxP site
consists of two 13 base pair inverted repeats separated by an 8 base pair spacer
region. The nucleotide sequences of the inverted repeats and the spacer region are
as follows.

ATAACTTCGTATAATGTATGCTATACGAAGTTAT 
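
The 13-8-13 structure of the LoxP sequence shown above can be checked with a few lines of Python. The sketch is illustrative only and forms no part of the disclosed method. The helper names revcomp and split_lox are hypothetical.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def split_lox(site):
    """Split a 34 bp lox site into (left repeat, spacer, right repeat)."""
    if len(site) != 34:
        raise ValueError("expected a 34 bp lox site")
    return site[:13], site[13:21], site[21:]


left, spacer, right = split_lox(LOXP)

# The two 13 bp repeats are inverted repeats of one another.
arms_are_inverted_repeats = right == revcomp(left)

# The 8 bp spacer is not its own reverse complement, which is what gives
# the site a direction.
spacer_is_palindromic = spacer == revcomp(spacer)
```

Here the repeats are ATAACTTCGTATA and TATACGAAGTTAT and the spacer is ATGTATGC. The asymmetry of the spacer is the property that makes the orientation of a lox site meaningful.
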
Other wild type lox sites include LoxB, LoxL and LoxR sites which are
nucleotide sequences isolated from E. coli. These sequences are disclosed and
described by Hoess et al., Proc. Natl. Acad. Sci. USA, 79:3398-3402 (1982).
Preferred wild type lox sites are LoxP or LoxC2. Lox sites can also be produced
by a variety of synthetic techniques which are known in the art. For example,
synthetic techniques for producing lox sites are disclosed by Ito et al., Nuc. Acid
Res., 10:1755 (1982) and Ogilvie et al., Science, 214:270 (1981).

The gene product of the cre gene is a recombinase herein designated
"Cre" which effects site-specific recombination of DNA at lox sites. As used
herein, the expression "cre gene" means a nucleotide sequence which codes for
a gene product which effects site-specific recombination of DNA in cells at lox
sites. One cre gene (the wild type cre gene) can be isolated from bacteriophage
P1 by methods known in the art. One method for isolating a cre gene from
bacteriophage P1 is disclosed by Abremski et al., Cell, 32:1301-1311 (1983).

Genes engineered into cells for producing a foreign protein are often
placed under the control of a highly active promoter. The activity of the
promoter can result in an overproduction of the protein which interferes with the
growth of the engineered cell. This overproduction of the protein can make it
difficult to grow the engineered cell in sufficient quantity to make protein
production economically feasible. The present invention provides a method
whereby engineered cells can be grown to a desired density prior to expressing
the engineered gene. The engineered gene is expressed, as desired, by
activating a regulatory nucleotide sequence responsible for controlling
expression of DNA encoding a variant recombinase. Methods of controlling the
expression of an engineered gene include the following:

(1) A DNA segment flanked by recombination sites in the same
orientation is introduced into DNA in a cell between a promoter and an
engineered gene to render the promoter incapable of expressing the gene. A
second DNA sequence comprising a regulatory nucleotide sequence and DNA
encoding a variant recombinase is also introduced into the DNA. After the
engineered cells are grown to a desired density, the regulatory nucleotide
sequence is activated thereby effecting expression of the variant recombinase
and producing a deletion of the DNA segment. The engineered gene would
then be expressed.

(2) A gene for a regulatory molecule flanked by recombination sites in 
the same orientation is introduced into DNA in a cell. The regulatory molecule 
inhibits expression of an engineered gene. A second DNA sequence comprising 
a regulatory nucleotide sequence and DNA encoding a variant recombinase is 
also introduced into the DNA. After the engineered cells are grown to a desired 
density, the regulatory nucleotide sequence is activated thereby effecting 
expression of the variant recombinase and producing a deletion of the gene for 
the regulatory molecule. The engineered gene would then be expressed. 

(3) An engineered gene lacking a promoter and flanked by two
recombination sites in opposite orientations is introduced into DNA in a cell
such that the 3' end of the gene lies adjacent to the transcription start site of a
regulatory nucleotide sequence. A second DNA sequence comprising a
regulatory nucleotide sequence and DNA encoding a variant recombinase is
also introduced into the DNA. Since the engineered gene would be transcribed
in the antisense direction, no engineered protein would be produced. After the
engineered cell is grown to a desired density, the regulatory nucleotide sequence
is activated thereby effecting expression of the variant recombinase and
producing an inversion of the desired gene. The engineered gene could then be
transcribed in the proper direction and expressed.
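
The three control schemes listed above can be pictured with a toy model in Python. The sketch is illustrative only and forms no part of the disclosed method. A construct is represented as an ordered list of (name, strand) elements. The element names "promoter", "site", "blocker", "repressor" and "gene", and the functions recombine and is_expressed, are hypothetical simplifications. Expression of the variant recombinase is represented simply by calling recombine.

```python
def recombine(construct):
    """Act on the first two recombination sites in the construct."""
    idx = [i for i, (name, _) in enumerate(construct) if name == "site"]
    if len(idx) < 2:
        return list(construct)
    a, b = idx[0], idx[1]
    if construct[a][1] == construct[b][1]:
        # Same orientation: delete the flanked elements and one of the sites.
        return construct[:a + 1] + construct[b + 1:]
    # Opposite orientation: reverse the flanked elements and flip their strand.
    flipped = [(name, -strand) for name, strand in reversed(construct[a + 1:b])]
    return construct[:a + 1] + flipped + construct[b:]


def is_expressed(construct):
    """The gene is expressed when no repressor gene is present and the first
    element after the promoter, other than a site, is the gene in forward
    orientation."""
    names = [name for name, _ in construct]
    if "repressor" in names or "promoter" not in names:
        return False
    for name, strand in construct[names.index("promoter") + 1:]:
        if name == "site":
            continue
        return name == "gene" and strand == +1
    return False


# Scheme (1): a blocking segment between promoter and gene is deleted.
scheme1 = [("promoter", +1), ("site", +1), ("blocker", +1), ("site", +1), ("gene", +1)]
# Scheme (2): the gene for an inhibitory regulatory molecule is deleted.
scheme2 = [("site", +1), ("repressor", +1), ("site", +1), ("promoter", +1), ("gene", +1)]
# Scheme (3): an antisense-oriented gene is inverted.
scheme3 = [("promoter", +1), ("site", +1), ("gene", -1), ("site", -1)]

before = [is_expressed(c) for c in (scheme1, scheme2, scheme3)]
after = [is_expressed(recombine(c)) for c in (scheme1, scheme2, scheme3)]
```

In this model each construct is silent before recombination and expressed afterwards, which corresponds to growing the engineered cells to the desired density first and activating the regulatory nucleotide sequence second.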

Numerous methods and techniques have been developed for the use of
Cre recombinase and other, similar recombinases such as FLP. The disclosed
variant recombinases can also be used in any of these methods. Adaptation of
these methods to the use of the disclosed variant recombinases is
straightforward. Generally, all that is required is substitution of a variant
recombinase (or a gene expressing a variant recombinase) for the original
recombinase (or recombinase gene) and, if appropriate, substitution of variant
(or wild type) recombination sites for the original recombination sites used in
the method.

Examples of methods involving wild type recombinases and wild type
recombination sites that can be adapted for use with the disclosed variant
recombinases and recombination sites include recombination of DNA in phage
packaging systems, recombination of DNA to form phage display libraries (for
example, Fisch et al., Proc. Natl. Acad. Sci. USA 93(15):7761-6 (1996), and
Waterhouse et al., Nucleic Acids Res. 21(9):2265-6 (1993)), and other uses (for
example, Sauer et al., Proc. Natl. Acad. Sci. USA 84:9108-9112 (1987),
Mullins et al., Nucleic Acids Res. 25(12):2539-40 (1997), Aoki et al., Mol. Med.
5(4):224-31 (1999)).

Other examples of specific methods in which the disclosed variant
recombinases can be used or substituted include methods disclosed in U.S.
Patent No. 5,888,981, U.S. Patent No. 5,888,732, U.S. Patent No. 5,885,836,
U.S. Patent No. 5,885,793, U.S. Patent No. 5,885,779, U.S. Patent No.
5,885,776, U.S. Patent No. 5,882,893, U.S. Patent No. 5,882,888, U.S. Patent
No. 5,877,400, U.S. Patent No. 5,871,907, U.S. Patent No. 5,866,755, U.S.
Patent No. 5,866,361, U.S. Patent No. 5,859,310, U.S. Patent No. 5,858,657,
U.S. Patent No. 5,854,067, U.S. Patent No. 5,851,808, U.S. Patent No.
5,849,995, U.S. Patent No. 5,849,989, U.S. Patent No. 5,849,708, U.S. Patent
No. 5,849,572, U.S. Patent No. 5,849,571, U.S. Patent No. 5,849,553, U.S.
Patent No. 5,844,079, U.S. Patent No. 5,843,744, U.S. Patent No. 5,843,742,
U.S. Patent No. 5,843,694, U.S. Patent No. 5,840,540, U.S. Patent No.
5,837,844, U.S. Patent No. 5,837,242, U.S. Patent No. 5,834,202, U.S. Patent
No. 5,830,729, U.S. Patent No. 5,830,698, U.S. Patent No. 5,830,461, U.S.
Patent No. 5,817,492, U.S. Patent No. 5,814,618, U.S. Patent No. 5,814,500,
U.S. Patent No. 5,814,300, U.S. Patent No. 5,807,995, U.S. Patent No.
5,807,708, U.S. Patent No. 5,801,030, U.S. Patent No. 5,800,998, U.S. Patent
No. 5,795,734, U.S. Patent No. 5,795,726, U.S. Patent No. 5,792,833, U.S.
Patent No. 5,792,632, U.S. Patent No. 5,789,156, U.S. Patent No. 5,777,194,
U.S. Patent No. 5,776,449, U.S. Patent No. 5,773,697, U.S. Patent No.
5,770,384, U.S. Patent No. 5,767,376, U.S. Patent No. 5,763,240, U.S. Patent
No. 5,756,671, U.S. Patent No. 5,744,343, U.S. Patent No. 5,744,336, U.S.
Patent No. 5,736,377, U.S. Patent No. 5,733,744, U.S. Patent No. 5,733,743,
U.S. Patent No. 5,733,733, U.S. Patent No. 5,731,182, U.S. Patent No.
5,723,765, U.S. Patent No. 5,723,333, U.S. Patent No. 5,723,287, U.S. Patent
No. 5,721,367, U.S. Patent No. 5,721,118, U.S. Patent No. 5,700,470, U.S.
Patent No. 5,686,595, U.S. Patent No. 5,679,523, U.S. Patent No. 5,677,177,
U.S. Patent No. 5,658,772, U.S. Patent No. 5,656,438, U.S. Patent No.
5,654,182, U.S. Patent No. 5,654,168, U.S. Patent No. 5,650,491, U.S. Patent
No. 5,650,308, U.S. Patent No. 5,650,298, U.S. Patent No. 5,643,727, U.S.
Patent No. 5,641,866, U.S. Patent No. 5,641,748, U.S. Patent No. 5,639,726,
U.S. Patent No. 5,635,381, U.S. Patent No. 5,629,179, U.S. Patent No.
5,629,159, U.S. Patent No. 5,614,389, U.S. Patent No. 5,612,205, U.S. Patent
No. 5,596,089, U.S. Patent No. 5,591,609, U.S. Patent No. 5,589,362, U.S.
Patent No. 5,539,094, U.S. Patent No. 5,530,191, U.S. Patent No. 5,527,695,
U.S. Patent No. 5,510,099, U.S. Patent No. 5,478,731, U.S. Patent No.
5,441,884, U.S. Patent No. 5,434,066, U.S. Patent No. 5,378,618, U.S. Patent
No. 5,354,668, U.S. Patent No. 5,334,515, U.S. Patent No. 5,300,431, and U.S.
Patent No. 4,959,317.

1. Use of Variant Recombinases in Plants and Plant Cells
Methods for introducing a DNA sequence into plant cells are known in
the art. Nucleic acids can generally be introduced into plant protoplasts, with or
without the aid of electroporation, polyethylene glycol, or other processes
known to alter membrane permeability. Nucleic acid constructs can also be
introduced into plants using vectors comprising part of the Ti- or Ri-plasmid, a
plant virus, or an autonomously replicating sequence. Nucleic acid constructs
can also be introduced into plants by microinjection or by high-velocity
microprojectiles, also termed "particle bombardment" or "biolistics" (Sanford, J.
C., Tibtech 6: 299 (1988)), directly into various plant parts. The preferred
means of introducing a nucleic acid fragment into plant cells involves the use of
A. tumefaciens containing the nucleic acid fragment between T-DNA borders
either on a disarmed Ti-plasmid (that is, a Ti-plasmid from which the genes for
tumorigenicity have been deleted) or in a binary vector in trans to a disarmed
Ti-plasmid. The Agrobacterium can be used to transform plants by inoculation
of tissue explants, such as stems, roots, or leaf discs, by co-cultivation with
plant protoplasts, or by inoculation of seeds or wounded plant parts.

Foreign genes can be introduced into a wide range of crop species.
Thus, the disclosed variant recombinases and method are applicable to a broad
range of agronomically or horticulturally useful plants. The particular method
which is employed to introduce the DNA sequence into a selected plant cell is
not critical. In a preferred embodiment, DNA sequences are introduced into
plant cells by co-cultivation of leaf discs with A. tumefaciens essentially as
described by Horsch et al., Science, 227: 1229-1231 (1985), omitting the nurse
cultures.

In the present method, the recombination sites are contacted with a
variant recombinase, thereby producing the site-specific recombination. In one
embodiment, a variant recombinase, or messenger RNA encoding a variant
recombinase, is introduced into the cells directly by micro-injection, biolistics,
or other protein or RNA introduction procedure. In a preferred embodiment,
DNA encoding the variant recombinase is introduced into the plant cell under
the control of a promoter that is active in plant cells. Suitable regulatory
nucleotide sequences are known in the art. The promoter which is employed
with a selected plant cell is not critical to the method of the invention. A partial
list of suitable promoters includes the 35S promoter of cauliflower mosaic virus
described by Odell et al., Nature, 313: 810-812 (1985); the promoter from the
nopaline synthase gene of A. tumefaciens described by Depicker et al., J. of
Mol. Appl. Genet., 1: 561-573 (1982); the promoter from a Rubisco small
subunit gene described by Mazur and Chui, Nucleic Acids Research 13:
2373-2386 (1985); the 1' or 2' promoter from the TR-DNA of A. tumefaciens
described by Velten et al., EMBO J., 12: 2723-2730 (1984); the promoter of a
chlorophyll a/b binding protein gene described by Dunsmuir et al., J. Mol. Appl.
Genet. 2: 285-300 (1983); the promoter of a soybean seed storage protein gene
described by Chen et al., Proc. Natl. Acad. Sci. USA, 83: 8560-8564 (1986); and
the promoter from the wheat EM gene described by Marcotte et al., Nature 335:
454-457 (1988). Variant recombinases can be expressed throughout the plant
generally in all cells at all stages of development, or expression of variant
recombinases can be more specifically controlled through the use of promoters
or regulatory nucleotide sequences having limited expression characteristics.
Variant recombinases can be expressed in a tissue specific manner, for example
only in roots, leaves, or certain flower parts. Variant recombinases can be
expressed in a developmentally specific time period, for example only during
seed formation or during reproductive cell formation. Expression of variant
recombinases can also be placed under the control of a promoter that can be
regulated by application of an inducer. In this case expression is off or very low
until the external inducer is applied. Promoters active in plant cells have been
described that are inducible by heat shock (Gurley et al., Mol. Cell. Biol. 6:
559-565 (1986)), ethylene (Broglie et al., Plant Cell 1: 599-607 (1989)), auxin
(Hagan and Guilfoyle, Mol. Cell. Biol. 5: 1197-1203 (1985)), abscisic acid
(Marcotte et al., Nature 335: 454-457 (1988)), salicylic acid (EPO 332104A2
and EPO 337532A1), and substituted benzenesulfonamide safeners (WO
90/11361). Control of expression of variant recombinases by the
safener-inducible promoter 2-2, or its derivatives, allows the expression to be
turned on only when the inducing chemical is applied and not in response to
environmental or phytohormonal stimuli. Thus expression can be initiated at
any desired time in the plant life cycle. Preferably, the regulatory nucleotide
sequence is a 35S promoter or a 2-2 promoter. The above techniques and
materials can also be used to express wild type recombinase in plant cells if
required by the particular recombination pattern to be accomplished.

One application of the disclosed variant recombinases is in controlling
male fertility in a method for producing hybrid crops. Hybridization of a crop
involves the crossing of two different lines to produce hybrid seed from which
the crop plants are grown. Hybrid crops are superior in that more of the desired
traits can be introduced into the production plants. For instance, quality traits
such as oil content, herbicide resistance, disease resistance, adaptability to
environmental conditions, and the like, can be hybridized in offspring so that
the latter are invested with the most desirable traits of their parents. In addition,
progeny from a hybrid cross may possess new qualities resulting from the
combination of the two parental types, such as yield enhancement resulting
from the phenomenon known as heterosis. Controlled cross-fertilization to
produce hybrid seeds has been difficult to achieve commercially due to
competing self-fertilization, which occurs in most crop plants.

Hybrid seed production is typically performed by one of the following
means: (a) mechanically removing or covering the male organs to prevent
self-fertilization followed by exposing the male-disabled plants to plants with male
organs that contain the trait(s) desired for crossing; (b) growing genetically
male-sterile plants in the presence of plants with fertile male organs that contain
the trait that is desired for crossing; or (c) treating plants with chemical
hybridizing agents (CHA) that selectively sterilize male organs followed by
exposing the male-disabled plants to plants with fertile male organs that contain
the trait that is desired for crossing. Some disadvantages to each of these
methods include: (a) applicability only to a few crops, such as corn, where the
male and female organs are well separated; and it is labor intensive and costly;
(b) genetically male sterile lines are cumbersome to maintain, requiring crosses
with restorer lines; (c) all CHAs exhibit some degree of general phytotoxicity
and female fertility reduction. Also CHAs often show different degrees of
effectiveness toward different crop species, or even toward different varieties
within the same species.

A molecular genetic approach to hybrid crop production that is applicable to a
wide range of crops and involves genetic male sterility is described in
EPA 89-344029. This system involves the introduction of a cell disruption gene that is
expressed only in the tapetal tissue of anthers thereby destroying the developing
pollen. The resulting genetically male sterile plants serve as the female parents
in the cross to produce hybrid seed. This system could be highly effective and
desirable. However, one disadvantage is that since the male sterile parent is
heterozygous for the sterility gene, which acts as a dominant trait, only 50% of
the plants grown from the hybrid seed are fertile; the rest retain the sterility
gene. This situation will result in reduced pollen shed in the production field
which may lead to reduced seed set and yield. Addition of recombinase
technology to this hybrid scheme allows restoration of fertility to a much higher
percentage of plants in the production field, as well as elimination of the cell
disruption gene. Placing the male sterility gene between recombination sites
allows it to be deleted following introduction of a variant recombinase into the
hybrid from the male parent.
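
The 50% figure above, and the effect of removing the sterility gene by recombination, can be reproduced with a short Python calculation. The sketch is illustrative only and forms no part of the disclosed method. It assumes a single dominant sterility allele (written "S"), a male parent lacking that allele, and inheritance of the recombinase by every hybrid plant. The parameter excision_efficiency, the fraction of S-carrying hybrids in which the gene is deleted, is a hypothetical quantity introduced for the example, as is the function name fertile_fraction.

```python
from fractions import Fraction
from itertools import product


def fertile_fraction(female, male, excision_efficiency=Fraction(0)):
    """Fraction of hybrid plants expected to be male fertile.

    female, male: two-allele genotypes at the sterility locus, e.g. ("S", "s"),
    where "S" is the dominant sterility gene.
    excision_efficiency: assumed fraction of S-carrying hybrids in which the
    recombinase deletes the sterility gene.
    """
    offspring = list(product(female, male))  # one allele from each parent
    carriers = sum(1 for genotype in offspring if "S" in genotype)
    sterile = Fraction(carriers, len(offspring)) * (1 - excision_efficiency)
    return 1 - sterile


# Heterozygous male-sterile female crossed with a fertile male, no recombinase.
without_recombinase = fertile_fraction(("S", "s"), ("s", "s"))
# Same cross, assuming the recombinase deletes the gene in 9 of 10 carriers.
with_recombinase = fertile_fraction(("S", "s"), ("s", "s"), Fraction(9, 10))
```

Without recombinase the expected fertile fraction is one half, as stated above. With the assumed 90% excision it rises to 19/20, and it reaches 1 if excision is complete.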

Another application of the disclosed variant recombinases is in making
seedless produce. Seedlessness is desirable in consumed produce for
convenience and taste. Currently "seedless" watermelon is sold that actually
contains some developed seed and a large number of immature seed that varies
in size up to that of fully mature seed. To produce these watermelons, first a
hybrid cross is made between a tetraploid maternal parent and a diploid
pollinator. The resulting triploid seed produces self-infertile plants that are
crossed with a diploid pollinator to produce seedless fruit (Kihara, Proc. Soc.
Hort. Sci., 58: 217-230 (1951)). This production scheme suffers the following
problems: (i) creating a tetraploid plant, which is accomplished by a
chromosome duplication method, is difficult. Also the number of seeds per fruit
on this tetraploid plant must be low since this has a positive correlation with
seed number in the final product (Andrus, Production of Seedless Watermelons,
USDA Tech. Bull. No. 1425 (1971)); (ii) good combining ability of the diploid
pollinator and the tetraploid plant is difficult to achieve (Henderson, J. Amer.
Soc. Hort. Sci., 102: 293-297 (1977)); (iii) the triploid seeds are much inferior
to regular diploid seeds in vigor and germinability (Maynard, Hort. Sci., 24:
603-604 (1989)). These problems, together with incomplete seedlessness in the
final product, make the development of seedless watermelon slow and difficult.
This ploidy-based approach to seedlessness is possible only in those few species
where unusual euploidy plants (tetraploid and triploid for watermelon, for
example) are viable.

A molecular genetic approach to seedlessness involving the disclosed
variant recombinases is much more efficient, results in a more reliably
seedless product, and does not involve changes in ploidy. Thus it is more
generally applicable to a wider range of species. A recombination site/polyA-
inactivated cell disruption gene regulated by a seed-specific promoter is
introduced into a plant. When this plant is crossed to a plant expressing a
variant recombinase, the disruption gene is activated and expressed in the seed,
thereby disrupting seed development. The certainty of endosperm failure
(caused by the cell disruption gene product) leading to the abortion of the whole
seed is very high. In most dicots, the endosperm supplies the nutrients needed
for early embryo development. Endosperm abortion invariably leads to seed
abortion (Brink and Cooper, Bot. Rev. 8: 423-541 (1947)).
10 The seed-specific promoter used can be selected from the group of 

promoters known to direct expression in the embryo and/or the endosperm of 
the developing seed, most desirably in the endosperm. Examples of seed- 
specific promoters include but are not limited to the promoters of seed storage 
proteins. The seed storage proteins are strictly regulated, being expressed 
15 almost exclusively in seeds in a highly tissue-specific and stage-specific manner 
(Higgins et ^.,Ami Rev, Plant Physiol 35: 191-221 (1984); Goldberg et al.. 
Cell 56: 149-160 (1989)). Also, different seed storage proteins may be 
expressed at different stages of seed development and in different parts of the 
seed. 

There are numerous examples of seed-specific expression of seed
storage protein genes in transgenic dicotyledonous plants. These include genes
from dicotyledonous plants for bean β-phaseolin (Sengupta-Gopalan et al.,
Proc. Natl. Acad. Sci. USA 82: 3320-3324 (1985) and Hoffman et al., Plant
Mol. Biol. 11: 717-729 (1988)), bean lectin (Voelker et al., EMBO J. 6:
3571-3577 (1987)), soybean lectin (Okamuro et al., Proc. Natl. Acad. Sci. USA
83: 8240-8344 (1986)), soybean Kunitz trypsin inhibitor (Perez-Grau and
Goldberg, Plant Cell 1: 1095-1109 (1989)), soybean β-conglycinin (Beachy et
al., EMBO J. 4: 3047-3053 (1985), Barker et al., Proc. Natl. Acad. Sci. 85:
458-462 (1988), Chen et al., EMBO J. 7: 297-302 (1988), Chen et al., Dev.
Genet. 10: 112-122 (1989), Naito et al., Plant Mol. Biol. 11: 683-695 (1988)),
pea vicilin (Higgins et al., Plant Mol. Biol. 11: 109-123 (1988)), pea
convicilin (Newbigin et al., Planta 180: 461 (1990)), pea legumin (Shirsat et
al., Mol. Gen. Genetics 215: 326 (1989)), rapeseed napin (Radke et al., Theor.
Appl. Genet. 75: 685-694 (1988)), as well as genes from monocotyledonous
plants such as for maize 15-kD zein (Hoffman et al., EMBO J. 6: 3213-3221
(1987)), barley β-hordein (Marris et al., Plant Mol. Biol. 10: 359-366 (1988)),
and wheat glutenin (Colot et al., EMBO J. 6: 3559-3564 (1987)). Moreover,
promoters of seed-specific genes operably linked to heterologous coding
regions in chimeric gene constructions also maintain their temporal and spatial
expression pattern in transgenic plants. Such examples include the Arabidopsis
thaliana 2S seed storage protein gene promoter to express enkephalin peptides
in Arabidopsis and Brassica napus seeds (Vandekerckhove et al., Bio/Technology
7: 929-932 (1989)), bean lectin and bean β-phaseolin promoters to express
luciferase (Riggs et al., Plant Sci. 63: 47-57 (1989)), and wheat glutenin
promoters to express chloramphenicol acetyl transferase (Colot et al., EMBO J.
6: 3559-3564 (1987)). Promoters highly expressed early in endosperm
development are most effective in this application. Of particular interest is
the promoter from the α subunit of the soybean β-conglycinin gene (Walling et
al., Proc. Natl. Acad. Sci. USA 83: 2123-2127 (1986)) which is expressed early
in seed development in the endosperm and the embryo.

The cell disruption gene used can be selected from a group of genes
encoding products that disrupt normal functioning of cells. There are many
proteins that are toxic to cells when expressed in an unnatural situation.
Examples include the genes for the restriction enzyme EcoRI (Barnes and Rine,
Proc. Natl. Acad. Sci. USA 82: 1354-1358 (1985)), diphtheria toxin A
(Yamaizumi et al., Cell 15: 245-250 (1987)), streptavidin (Sano and Cantor,
Proc. Natl. Acad. Sci. USA 87: 142-146 (1990)), and barnase (Paddon and
Hartley, Gene 53: 11-19 (1987)). Most preferred for this system is the coding
region of barnase which has been shown to be highly effective in disrupting the
function of plant cells (EPA 89-344029).

A highly desirable seedless system is one in which fully fertile F1 seed
develops, that can then be grown into plants that produce only seedless fruit.
This system is economically favorable in that for each cross pollination, a large
number of seedless fruits result: the number of F1 seed from one cross X the
number of fruits produced on an F1 plant. Also incorporated in this scheme are
the advantages of growing a hybrid crop, including the combining of more
valuable traits and hybrid vigor. This is accomplished in the same manner as
described above except that the recombination site/polyA-inactivated disruption
gene is expressed from a seed maternal tissue (seed coat or nucellus)-specific
promoter. For example, the seed coat is the outgrowth of the integuments, a
strictly maternal tissue. Therefore the hybrid cross that brings the
recombination site/polyA-inactivated disruption gene together with the
recombinase gene does not involve this seed coat tissue. The seed coat of the F1
seed has either recombination sites or recombinase, depending on which is used
as the female parent, and thus F1 seed develop normally. After the F1 seed
gives rise to a fruit-bearing F1 plant, all vegetative cells (including seed coat
cells) inherit both recombination sites and recombinase from the embryo. Thus
the seed coat of the F1 plant has an activated cell disruption gene.

The seed coat is an essential tissue for seed development and viability.
When the seed is fully matured, the seed coat serves as a protective layer to
inner parts of the seed. During seed development, the seed coat is a vital
nutrient-importing tissue for the developing embryo. The seed is nutritionally
"parasitic" to the mother plant. All raw materials necessary for seed growth
must be imported. In seeds of dicotyledonous plants, the vascular tissue enters
the seed through the funiculus and then anastomoses in the seed coat tissue.
There is no vascular tissue connection or plasmodesmata linkage between the
seed coat and the embryo. Therefore, all nutrient solutes delivered into the
developing seed must be unloaded inside the seed coat and then move by
diffusion to the embryo. Techniques have been developed to study the nutrient
composition in the seed coat (Hsu et al., Plant Physiol. 75: 181 (1984); Thorne
& Rainbird, Plant Physiol. 72: 268 (1983); Patrick, J. Plant Physiol. 115: 297
(1984); Wolswinkel & Ammerlaan, J. Exp. Bot. 36: 359 (1985)), and also the
detailed cellular mechanisms of solute unloading (Offler & Patrick, Aust. J.
Plant Physiol. 11: 79 (1984); Patrick, Physiol. Plant. 78: 298 (1990)). It is
obvious that the destruction of this vital nutrient-funnelling tissue causes seed
abortion.

The disclosed tissue-specific and site-directed DNA recombination can
be used to obtain seedless fruit production. This method is useful for the
production of seedless watermelon, for example. A combination of gene
expression specific for maternally inherited seed tissue and the disclosed
recombinase system can be used for the production of seedless watermelon.
The system can be universally applied to any horticultural crop in which the
presence of seeds is undesirable and difficult to eliminate through
conventional breeding methods. The system also allows the normal production
of F1 seeds. The ability to maintain heterosis is an advantage of producing F2
seedless fruits.

The existing production of seedless watermelon indicates that seed
development is not essential for watermelon fruit development. However,
conventional production of seedless watermelon using the ploidy imbalance
trick has never been very popular due to the difficulty of overcoming the yield
and production problems. Creating and maintaining the tetraploid (4n) female
germline, and producing the triploid (3n) seeds have made the seed cost high.
Cross-pollination is needed for the production of triploid seeds (4n X 2n) and
seedless fruits (3n X 2n). Also triploid seed germination is usually poor due to
ploidy imbalance.

The present approach eliminates the dependence on polyploid germlines
and provides an efficient system for producing seedless fruit. The products of
double fertilization of higher plants are the embryo and endosperm. The seed
coat (including the integumentary tapetum) and nucellus (the tissue
encompassing the embryo sac) are the remaining seed tissues that are
maternally inherited. In addition to general protection, the seed coat and
nucellus also play an important role in importing nutrients into the developing
embryo and endosperm. Seed development will be aborted if this vital
nutrient-importing mechanism of the seed coat/nucellus is debilitated. This
will be accomplished by using the recombinase system to activate a
cell-damaging gene only in these tissues. Controlling the gene activation in a
maternal tissue-specific manner allows production of normal F1 seed, but
abortion of F2 seed. A seed coat or nucellus promoter is coupled to a
tissue-destructive (lethal) gene in order to prevent seeds from forming. The
destructive gene is inactive in the seed parent due to the presence of a
blocking transcription terminator. The terminator is flanked by recombination
sites for subsequent excision by a recombinase-mediated recombination event.
Expression of the recombinase is also controlled by the seed
coat/nucellus-specific promoter. When plants carrying the separate recombinase
and recombination site constructs are crossed, the F1 seed will be viable
because seed coat/nucellus is maternal tissue, and in that tissue recombinase
and recombination sites are not combined. When the F1 seed is used as planting
seed, the self-pollinated or out-crossed plants will produce seedless fruits or
vegetables, since in seed coat/nucellus tissues recombinase and recombination
sites are combined, and the lethal gene is activated.
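
The inheritance logic described above can be sketched as a small model. This is an illustrative sketch only, not part of the disclosed method: the names Plant, cross and seed_is_viable are hypothetical, and the sketch assumes each parent is homozygous for its construct so that every F1 embryo carries both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    """Genotype of a plant, reduced to the two transgenes of interest."""
    has_sites: bool        # terminator-blocked lethal gene flanked by recombination sites
    has_recombinase: bool  # seed coat/nucellus-specific recombinase gene


def cross(female: Plant, male: Plant) -> Plant:
    """Embryo genotype, assuming each parent always transmits its construct."""
    return Plant(
        has_sites=female.has_sites or male.has_sites,
        has_recombinase=female.has_recombinase or male.has_recombinase,
    )


def seed_is_viable(mother: Plant) -> bool:
    """Seed coat and nucellus are maternal tissue, so only the genotype of the
    plant bearing the seed decides whether the lethal gene is activated there."""
    lethal_gene_active = mother.has_sites and mother.has_recombinase
    return not lethal_gene_active


site_parent = Plant(has_sites=True, has_recombinase=False)
recombinase_parent = Plant(has_sites=False, has_recombinase=True)

f1_plant = cross(site_parent, recombinase_parent)
f1_seed_viable = seed_is_viable(site_parent)  # F1 seed develops on the female parent
f2_seed_viable = seed_is_viable(f1_plant)     # F2 seed develops on the F1 plant
```

Under these assumptions the F1 seed is viable whichever parent is used as the female, while seed borne on the F1 plant is not, which is the behavior the scheme relies on.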

2. Use of Variant Recombinases for Phage Packaging 
The disclosed variant recombinases can also be used to aid in phage
packaging. The cloning system described herein utilizes a headful in vitro
packaging system to clone foreign DNA fragments as large as 95 kb, which
permits the isolation of DNA fragments that are at least twice the size of those
that can be obtained by lambda cosmid cloning. This increased cloning
capacity has the following utility:

(1) Genes in the 45-95 kb size range and, more particularly, in the 70-95
kb size range can now be directly cloned and genes in the 25-45 kb size range
can be cloned more easily.

(2) Chromosomal "walking" and "jumping" techniques can be speeded
up by a factor of at least two and should be more accurate because of the
reduced number of contiguous segments that need to be linked together.

(3) The cloning system of the invention is useful as a means for the
delivery of DNA efficiently to bacteria which otherwise do not take up DNA
from solution well.

Specifically, the headful packaging system of this invention for cloning
foreign DNA fragments as large as 95 kb comprises:

(a) modifying vector DNA by inserting a stuffer fragment into a blunt
end producing site which is proximal to a pac site;

(b) digesting the product of step (a) to produce two vector arms each of
which contains (i) a blunt end, (ii) another end which is compatible with the
foreign DNA fragment which is to be cloned, and (iii) a recombination site;

(c) ligating the foreign DNA to the product of step (b) without
generating concatemers;

(d) reacting the product of step (c) with pac cleavage proficient extract
and head-tail proficient extract wherein the ratio of large heads to small heads in
the head-tail extract is at least 5:1;

(e) infecting a bacterial strain expressing a variant recombinase with the
product of step (d); and

(f) recovering the cloned DNA.

The term pac is a generic name which refers to the site needed to initiate
packaging of DNA. The pac cleavage proficient extract contains the
recognition proteins necessary to cleave the pac site and, thus, initiate
packaging. The head-tail proficient extract contains the heads and tails needed
to package the cloned DNA into a virus particle. The term concatemer means a
DNA molecule consisting of repeating units arranged in a head-to-tail
configuration. The term stuffer fragment refers to a DNA fragment which is
inserted into the vector DNA at a unique site, and within which headful
packaging is terminated. The terms bacteriophage and phage are used
interchangeably herein.

Although many of the elements described herein pertain to the P1
bacteriophage cloning system, those skilled in the art will appreciate that, with
the exception of the components needed to package DNA (pac and packaging
extracts), many of the elements discussed below, such as plasmid replicon and a
multicopy or lytic replicon, pertain to the recovery of packaged DNA and can
be used to recover DNA in bacteria, such as E. coli, with other cloning systems,
for example, bacteriophage, yeast, etc.

Bacteriophages which are suitable to practice the invention must have a
large head capacity and the elements necessary for packaging DNA must be
defined. For example, for phages P22 and T1, which utilize headful packaging,
the necessary packaging elements are defined. However, P22 and T1 do not
have a very large head capacity. On the other hand, for phage T4, which has a
large head capacity, the necessary packaging elements have not been defined.

The elements necessary for packaging DNA (i.e., an in vitro headful
packaging system) are the following:

(1) a unique site, pac, which is cleaved by recognition proteins; it is the
pac cleavage proficient extract which contains the recognition proteins
necessary to cleave the pac site; and

(2) empty phage heads into which the DNA is packaged until the head
has been completely filled, then a cleavage event is triggered (the "headful" cut)
which separates the packaged DNA away from the remaining components; it is
the head-tail proficient extract which contains the heads and tails needed to
package the cloned DNA into a virus particle.

Although initiation of packaging is site-specific (cleavage of the pac site
initiates packaging), termination of packaging is not site-specific. In other
words, no unique site is recognized, as packaging will terminate at whatever
point the head has been filled.

In the case of P1, the DNA substrate used in the packaging reaction
during the viral life cycle is a concatemer consisting of individual units of the
P1 chromosome arranged in a head-to-tail manner. Headful packaging, using
either P1 phage or any other phage, is a four step process: (1) a unique site,
pac, is recognized and cleaved by the pac recognition proteins (PRPs); (2) DNA
on one side of the cleavage is packaged into an empty phage head until the head
has been completely filled; (3) a second cleavage event is then triggered (the
"headful" cut) that separates the packaged DNA away from the rest of the
concatemer; and (4) a second round of DNA packaging is initiated from the free
end generated by the previous "headful" cut, hence the term processive headful
cutting. However, if a concatemer is not generated then
processive headful packaging does not occur.
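
The four steps above can be sketched as a simple loop over a linear concatemer. This is a minimal illustration under stated assumptions, not a description of the packaging extracts themselves: the function name headful_packaging and the numbers used (a 350 kb concatemer, pac at 10 kb, a 110 kb head) are hypothetical, and packaging is assumed to proceed in one direction from pac.

```python
def headful_packaging(concatemer_kb: float, pac_position_kb: float,
                      head_capacity_kb: float) -> list[tuple[float, float]]:
    """Return the (start, end) intervals, in kb, packaged from a linear concatemer.

    Step 1: cleavage at pac fixes the first free end.
    Steps 2-3: DNA fills one head, then the "headful" cut is made.
    Step 4: the next round starts from the free end left by the previous cut.
    A final stretch shorter than one headful is left unpackaged.
    """
    packaged = []
    start = pac_position_kb
    while start + head_capacity_kb <= concatemer_kb:
        end = start + head_capacity_kb
        packaged.append((start, end))
        start = end  # processive: the headful cut creates the next starting end
    return packaged


# Hypothetical numbers chosen only to illustrate processive cutting.
rounds = headful_packaging(concatemer_kb=350.0, pac_position_kb=10.0,
                           head_capacity_kb=110.0)

# A substrate shorter than one headful (no concatemer) yields no packaged DNA here.
no_concatemer = headful_packaging(concatemer_kb=100.0, pac_position_kb=0.0,
                                  head_capacity_kb=110.0)
```

Note that only the first cut is site-specific; every later boundary is set purely by head capacity.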

The ends of the packaged P1 DNA do not contain complementary
single-stranded sequences, as do the ends of packaged bacteriophage lambda
DNA, and consequently after P1 DNA is injected into a bacterium its
cyclization does not occur by strand annealing but rather by recombination
between homologous sequences present at the ends of the molecule. Because of
this circumstance, any vector that uses P1 packaging, or for that matter any
headful packaging mechanism, must devise a means of cyclizing the linear
packaged DNA by recombination. Cyclizing is accomplished by incorporating
recombination sites into the vector and using a disclosed variant recombinase to
cyclize the DNA after injection into gram-negative bacterial strains expressing
the variant recombinase.

P1 produces two head sizes, a big head that can accommodate 105-110
kb of DNA, and a small head that can accommodate no more than 45 kb of
DNA. Normally the ratio of big to small heads in a P1 wild-type infection is
10:1; however, in the cm-2 mutant of P1 used to prepare some of the packaging
lysates described herein, the ratio of head sizes is 1:1. The head-tail packaging
lysate prepared from the cm mutant of P1 contained the usual ratio of big to
small heads which is about 10:1. This is the preferred lysate for preparing
head-tail packaging extract. To ensure packaging of DNA exclusively into the big
phage heads, the DNA must be bigger than that which can be accommodated by
the small heads. It is generally desired that there be a large excess of big heads.
However, the ratio of large heads to small heads should not fall below a ratio of
about 5:1.
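
The two size constraints in this paragraph can be written down as simple checks. This is a rough sketch for orientation only; the function names are hypothetical, the 110 kb figure is the upper end of the stated 105-110 kb range, and "about 5:1" is treated as an exact threshold.

```python
SMALL_HEAD_MAX_KB = 45.0  # small heads accommodate no more than 45 kb
BIG_HEAD_MAX_KB = 110.0   # upper end of the 105-110 kb range given for big heads
MIN_BIG_TO_SMALL = 5.0    # ratio of large to small heads should not fall below about 5:1


def heads_able_to_accommodate(dna_kb: float) -> set[str]:
    """Head types whose capacity is large enough for DNA of the given size."""
    heads = set()
    if dna_kb <= SMALL_HEAD_MAX_KB:
        heads.add("small")
    if dna_kb <= BIG_HEAD_MAX_KB:
        heads.add("big")
    return heads


def packaged_only_in_big_heads(dna_kb: float) -> bool:
    """True when the DNA is too large for small heads but still fits big heads."""
    return heads_able_to_accommodate(dna_kb) == {"big"}


def extract_ratio_acceptable(big_heads: float, small_heads: float) -> bool:
    """Check the big:small head ratio without dividing by zero."""
    return big_heads >= MIN_BIG_TO_SMALL * small_heads
```

For example, a 95 kb molecule passes packaged_only_in_big_heads, whereas a 40 kb molecule does not because small heads could also take it up.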

Illustration

The following illustration describes an example of how the disclosed
method can be used to generate variant FLP recombinases with altered site
specificity. As with other recombinases, the method preferably uses the
following components:

1. An in vitro mutagenesis system;

2. A recombinase expression plasmid that allows varying levels of
expression by a simple environmental control (for example, by the presence of
varying amounts of an inducer substance in the growth media, by temperature,
or by osmolarity);

3. An indicator/selector bacterial strain. The strain carries both an
indicator recombination substrate for detection of recombination at the wild
type recombination site and a second recombination substrate that allows
selection for recombinase mutants that have gained the ability to recognize and
perform recombination at a target mutant recombination site (that is, a variant
recombination site). Importantly, the wild type and target mutant sites are
designed so that recombination between the mutant and wild type sites is
blocked even with a mutant recombinase that can recognize both the wild type
site and also the target mutant site. This design prevents unwanted
recombination between the wildtype and target mutant recombination sites that
could interfere with either selection or detection of desired recombinational
outcomes. The block is imposed by designing the wildtype and mutant sites to
have different spacer regions (that is, different compatibility sequences), for
example, the normal "wt" spacer for the wildtype recombination site, and an
alternative spacer "A1" for the other recombination substrate. In an otherwise
nonmutant recombination site, DNA recombination proceeds efficiently both for
recombination sites having the wt spacer (that is, a recombination between two
wt sites) and also for sites having the A1 spacer (that is, a recombination
between two A1 sites). Yet, recombination between the A1 site and the wt site
is blocked (that is, recombination between a wt site and an A1 site does not
occur). This strategy is applicable to all recombinases that have a recombination
target site displaying one or more recombinase binding sites (repeat elements)
on each side of a spacer region in which recombination occurs (Nunes-Düby et
al., Nucl. Acids Res., 26:391-406 (1998)). Such sites display a requirement for
homology in the spacer elements for optimal recombination activity; this has
been shown to be the case for members of the Int family of recombinases,
including Cre, lambda Int, and FLP (Craig, Ann. Rev. Genet. 22:77-105 (1988)).
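
The site design rule in component 3 can be summarized as a small predicate. This is a simplified, all-or-nothing sketch under assumptions: the names Site and can_recombine and the labels "wt", "mutant" and "alt" are hypothetical, and real recombination efficiencies are graded rather than boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    repeats: str  # recombinase binding elements: "wt" or "mutant"
    spacer: str   # compatibility sequence: "wt" or "alt" (the alternative spacer)


def can_recombine(a: Site, b: Site, recognized_repeats: set[str]) -> bool:
    """Two sites recombine only if the recombinase binds both sets of repeats
    and the two spacers are identical."""
    both_bound = a.repeats in recognized_repeats and b.repeats in recognized_repeats
    return both_bound and a.spacer == b.spacer


wild_type_site = Site(repeats="wt", spacer="wt")
alt_spacer_site = Site(repeats="wt", spacer="alt")
target_mutant_site = Site(repeats="mutant", spacer="alt")

wt_enzyme = {"wt"}                   # recognizes only wild type repeats
broadened_mutant = {"wt", "mutant"}  # recognizes both kinds of repeats
```

With these definitions, even the broadened mutant cannot recombine the wild type site with the target mutant site, because their spacers differ.
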
The preferred in vitro mutagenesis system is that of Stemmer (Stemmer,
Nature 370:389-391 (1994)), or a variant of that strategy. After mutagenesis
and assembly of fragments into a full-length FLP gene, it is cloned into the
expression vector.

The expression plasmid to be used can be any of the "inducible"
expression plasmids available in bacteria. For this illustration one of the pBAD
plasmids for E. coli was chosen that allows expression of recombinase by
growth on arabinose (Guzman et al., J. Bacteriol. 177:4121-4130 (1995)), and
which can be turned off by growth on glucose (and no arabinose). For this
illustration the expression plasmid carries the replication origin of pACYC, and
the FLP recombinase gene is under the control of the E. coli ara promoter
region. In addition, the plasmid carries the selectable marker Cm^r which
confers resistance to the antibiotic chloramphenicol. Because the pACYC
replicon is low copy, its use may be advantageous in preventing excessive
expression of FLP. Alternatively, a higher copy replicon could be used, such as
that of ColE1. In that case the expression level of FLP must also be carefully
controlled using the inducer substance arabinose.

The indicator/selector bacteria carry two different reporter constructs
for FLP-mediated recombination. The first reporter construct consists of two
FRT sites (FLP recombination target; that is, the recombination site recognized
by FLP recombinase) in direct orientation (an excision substrate) and resides on
a low copy replicon that is compatible with the FLP expression construct. In
this example the first substrate is integrated into the E. coli genome. This can
be done by incorporating the FRT substrate onto phage lambda and then
constructing a lambda lysogen. Alternatively the FRT substrate could reside on
a low-copy replicon that is compatible with that of the FLP expression vector
and which has an additional selectable marker, for example resistance to
bleomycin. This FRT substrate carries two FRT sites in direct orientation
flanking a gene whose presence can be easily monitored. In this illustration a
constitutively expressing lacZ gene was used whose presence can be determined
simply by growing colonies on plates containing X-gal, upon which they will
become blue in color. Loss of the lacZ gene by FLP-mediated recombination
results in white colony formation on X-gal plates.

The indicator/selector bacterial strain also carries a second FRT-like
substrate. This is a plasmid element having two FRT-M sites in direct
orientation flanking DNA sequences (STOP) that disallow expression of a
downstream selectable marker. In this example, nonexpression of the selectable
marker is achieved by placing genetic elements in the following order:
constitutive promoter - FRT-M - STOP - FRT-M - 'neo, where 'neo indicates
the promoterless neo gene of Tn5. Hence this cassette cannot express neo and
cells are sensitive to the antibiotic kanamycin. Excision of STOP by
recombination at FRT-M is designed to permit expression of neo so that cells
now become resistant to kanamycin. The plasmid carries an additional
selectable marker, Ap^r, conferring resistance to ampicillin, to maintain presence
of the plasmid in E. coli. The STOP sequence here is the strong transcriptional
terminator rrnBT1T2 (Liebke et al., Nucleic Acids Res. 13:5515-5525 (1985)).
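
The logic of the selector cassette can be sketched by treating it as an ordered list of elements. This is an illustrative sketch only; the function names are hypothetical and the model ignores everything except element order.

```python
def excise_between_direct_repeats(cassette: list[str], site: str) -> list[str]:
    """Delete the segment between the first two copies of `site`,
    leaving a single copy behind (excision between directly repeated sites)."""
    positions = [i for i, element in enumerate(cassette) if element == site]
    if len(positions) < 2:
        return list(cassette)  # nothing to excise
    first, second = positions[0], positions[1]
    return cassette[:first + 1] + cassette[second + 1:]


def expresses_neo(cassette: list[str]) -> bool:
    """'neo is promoterless: it is expressed only when the promoter lies
    upstream with no STOP terminator in between."""
    if "promoter" not in cassette or "'neo" not in cassette:
        return False
    start, end = cassette.index("promoter"), cassette.index("'neo")
    return start < end and "STOP" not in cassette[start:end]


selector = ["promoter", "FRT-M", "STOP", "FRT-M", "'neo"]
recombined = excise_between_direct_repeats(selector, "FRT-M")
```

Before recombination the cassette is kanamycin sensitive; after excision one FRT-M site remains between the promoter and 'neo and the marker can be expressed.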

Figure 17 shows wt FRT, FRT-A1, and FRT-M sites used in this
illustration. Although the wildtype FRT site displays three inverted repeat
elements, recombination proceeds efficiently with sites carrying two of these
repeats in the inverted configuration shown (Jayaram, Proc. Natl. Acad. Sci.
USA 82:5875-5879 (1985)). Either the full or minimal site can be used since
both are recombinationally functional. The FRT-A1 site is designed to have an
altered spacer but is still functional for self X self recombination
(Senecoff et al., J. Biol. Chem. 261:7380-7386 (1986)). The target FRT-M site
is designed to carry symmetrical mutations in the repeat elements that disallow
efficient FLP-mediated recombination (Senecoff et al., J. Mol. Biol. 201:405-421
(1988)), and also the spacer mutation of FRT-A1.

Importantly, the FRT-M site differs from the FRT site in two ways.
First, both of the 13 bp inverted repeat elements (that is, the recognition
sequences) are mutated in a symmetrical manner such that the wt FLP enzyme
does not catalyze recombination between two FRT-M sites, or does so only
extremely poorly (< 0.1%). Second, the spacer region (that is, the compatibility
sequence; the 8 bp region between the 13 bp inverted repeats) is replaced with
an alternate spacer A1. The alternate spacer, when present in an otherwise wt
FRT site, which we will call FRT-A1, is permissive for FLP-mediated
recombination between two FRT-A1 sites, but does not permit recombination
between FRT-A1 and FRT. Use of the FRT-M site, which contains the
heterologous spacer, prevents recombination between FRT and
FRT-M by FLP or by a mutant FLP protein that might otherwise catalyze
recombination between the wt FRT substrate and the mutant target FRT-M
substrate. Unwanted recombination between the wildtype and target mutant
recombination sites would decrease efficiency of the selection procedure by (a)
not limiting recombination at the target mutant site specifically to these sites
and thus compromising the selection (at FRT-M sites) for mutant FLP
recombinases, (b) affecting the accuracy of the specific indication of activity at
the wt FRT sites, and (c) decreasing either the plasmid stability of the FRT-M
selector substrate or the integrity of the bacterial chromosome (or compatible
plasmid) carrying the wt FRT sites.

Procedure (Figure 18): the FLP gene is mutagenized in vitro and then
cloned into the inducible expression vector, in this case a pBAD derivative that
places FLP under the control of the arabinose-inducible pBAD promoter. The
pool of mutagenized FLP genes is transformed into the FRT indicator/selector
strain which is pre-induced with arabinose and/or induced with arabinose during
DNA transformation. Bacterial colonies are then selected to be simultaneously
resistant to chloramphenicol (to retain the FLP expression plasmid), ampicillin
or carbenicillin (to retain the selector plasmid) and kanamycin (to select for
cells in which FRT-M X FRT-M recombination has occurred) on agar plates
containing either arabinose (for continued FLP expression) or glucose (to
prevent prolonged FLP expression). In some instances it may be advantageous
to limit FLP expression, either to better enrich for those FLP mutants that have
more avidity for recombination at the FRT-M sites or to better exclude those
FLP mutants that retain activity at the wildtype FRT sites. This is because
prolonged or high-level FLP expression can lead to inefficient but detectable
recombination at mutant sites.

Either all Kan^r colonies or only those that are blue on X-gal plates are
then pooled and harvested for DNA preparation. A second round of FLP gene
mutagenesis and selection is then initiated by PCR amplification. Multiple
rounds of mutagenesis and selection are used to obtain FLP mutants with
altered site-specificity. Comparison of various individual isolates allows
determination of critical amino acid residues that contribute to the desired
mutant phenotype.
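
The way the two reporters are read together can be sketched as follows. This is a simplified interpretation, not part of the disclosed procedure: the function names are hypothetical, kanamycin resistance is taken to mean FRT-M recombination occurred, and a white colony is taken to mean the lacZ gene was lost by recombination at the wt FRT sites.

```python
def classify_colony(kan_resistant: bool, blue_on_xgal: bool) -> str:
    """Infer recombinase activity from the selector (neo) and indicator (lacZ)."""
    active_on_frt_m = kan_resistant      # STOP excised, neo expressed
    active_on_wt_frt = not blue_on_xgal  # lacZ excised, colony white
    if active_on_frt_m and active_on_wt_frt:
        return "active on FRT-M and on wt FRT"
    if active_on_frt_m:
        return "active on FRT-M only"
    if active_on_wt_frt:
        return "active on wt FRT only"
    return "no detectable activity"


def pool_for_next_round(colonies: list[tuple[str, bool, bool]],
                        blue_only: bool) -> list[str]:
    """Pick colonies (name, kan_resistant, blue_on_xgal) to pool: either all
    kanamycin-resistant colonies, or only those that are also blue."""
    return [name for name, kan_r, blue in colonies
            if kan_r and (blue or not blue_only)]


# Hypothetical colonies used only to exercise the two pooling options.
colonies = [("c1", True, True), ("c2", True, False), ("c3", False, True)]
all_kan_resistant = pool_for_next_round(colonies, blue_only=False)
blue_kan_resistant = pool_for_next_round(colonies, blue_only=True)
```

On kanamycin plates only the first two classes would actually form colonies; the other two are included to make the mapping complete.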

The same rationale and procedure can be used to generate a second class
of altered FLP recombinases. The target mutant FRT site used is, however,
different. In this case the target is the FRT-M2 site (Figure 19) which carries a
different binding site mutation(s) than does the FRT-M site as described above.

Examples 

Example 1: Selection of Variant Cre Recombinases
The following example describes the production and analysis of some
examples of the disclosed variant recombinases. Cre mutants characterized by a
wider substrate recognition were created by applying a technique called directed
molecular evolution: multiple rounds of a random mutagenesis procedure
(DNA shuffling; Stemmer, W. P. C., Proc. Natl. Acad. Sci. USA, 91:10747-10751
(1994)) and a sensitive selection for the desired phenotypes allow
candidate mutants to accumulate within the generated pools of mutated sequences.
The Cre mutants created in this example showed wt-like activity on loxP sites.
In addition, they performed on an altered substrate, called loxK2, that is not
recognized by the wt enzyme. Two transversions from adenine (loxP) to
thymine (loxK2) at positions 11' and 12' of the lox sequence are the barriers
that inhibit wt Cre from recognizing loxK2: the two thymines are believed to
cause repulsive forces with the acidic side chain of a glutamate residue in the J
helix of wt Cre (position 262). This glutamate was found to be replaced by a
glycine in all mutants with remarkably increased activity on loxK2. Additional
site-directed mutagenesis experiments, confined to the glutamate at position 262
of Cre, confirmed that the E262G and also the E262W mutations alone are
sufficient to increase loxK2 activity by a factor of 10^ without affecting loxP
recognition. Other point mutations identified in the analyzed mutants may
however be responsible for increasing the newly obtained specificity even
further (10-fold compared to E262G alone).

MATERIALS AND METHODS 
General Procedures 
Standard Reagents 

The following reagents were used in all experiments: 10x TBE
(Tris-Borate-EDTA, pH 8.3) was purchased from Biofluids, Inc. (Rockville, MD) and
diluted to 1x with deionized water prior to use. TE (Tris-HCl 10 mM, EDTA 1
mM, pH 8.0 and pH 7.5) and Tris-HCl (1 M, pH 7.5 and 8.0) came from
Quality Biological, Inc. (Gaithersburg, MD), as well as autoclaved LB
(Luria-Bertani) and SOC broth. L-(+)-arabinose (>99%) was ordered from
Sigma-Aldrich Fine Chemicals (St. Louis, MO) and anhydrous D-glucose from
Mallinckrodt Laboratory Chemicals (Phillipsburg, NJ).
Gel Electrophoresis

For DNA electrophoresis, 0.8% agarose TBE gels were used (GTG SeaKem
Agarose (FMC, Rockland, ME)). Gels were prestained with 0.25 µg/ml
EtBr (Ethidium Bromide, 10 mg/ml (Life Technologies, Inc., Grand Island,
NY)). The electrophoresis apparatus used was a DNA SUB CELL™ (BioRad,
Hercules, CA) with an OSP 105 (OWL, Woburn, MA) power supply. Gels were
run at 60 V (5 V/cm) as recommended by Sambrook et al., Cold Spring Harbor,
New York: Cold Spring Harbor Laboratory Press (Second Edition) (1989).
Occasionally, for small amounts of samples, 50 ml minigels were used under
similar conditions (Hoefer HE33, Hoefer Scientific Instruments, San Francisco,
CA). Molecular weight standards were λ / Hind III digest (Research Genetics,
Huntsville, AL) and Ready-Load™ 100 bp DNA ladder (Life Technologies),
providing a standard size range from 100 bp to 23130 bp (Figure 4). For
standard fragment purification from gel, the Geneclean II® Kit (BIO 101, Inc.,
La Jolla, CA (11/98)) was used, following the manufacturer's instructions.
Minipreps and Plasmids

Plasmids for diagnostics, cloning, and sequence analysis were prepared
using the Wizard™ Minipreps Plus Kit (Promega, Wizard™ Minipreps Plus
DNA Purification System, Instruction Manual (Madison, WI) (1/96)). Useful
ones were assigned a pBS number and stored in TE pH 8.0 at +4°C.
Oligonucleotides

All oligonucleotides used as PCR primers, for plasmid construction, or
in the mutagenesis procedure, were ordered from Midland, Inc. (Midland, TX)
in gel filtration (GF) quality. The lyophilized oligonucleotides were assigned a
BSB number, suspended in HPLC grade water (Sigma-Aldrich) at a final
concentration of 300 µM, and stored at -20°C.
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DNA Digests and Ligations 

All enzymes used for DNA manipulations (restriction enzymes, T4
ligase, etc.) were purchased from New England Biolabs, Inc.
(Beverly, MA) and used as recommended in the manufacturer's
catalog (1998/99). Briefly, for restriction enzyme digests the total reaction
volume was 20 µl with approximately 10 units (U) of enzyme. For DNA
fragment ligations, 10 µl with 200 U of T4 DNA ligase were used.
E. coli Strains

All E. coli strains, except where otherwise mentioned, were derived originally
from DH5α: endA1 hsdR17 (rK⁻ mK⁺) supE44 thi-1 recA1 gyrA (Nalʳ) relA1
Δ(lacIZYA-argF)U169 deoR (φ80dlacΔ(lacZ)M15) (Woodcock et al., Nucl.
Acids Res., 17:3469-3478 (1989); Raleigh et al., In Current Protocols in
Molecular Biology, eds. Ausubel, F.M. et al. (New York: Publishing Associates
and Wiley Interscience), Unit 1.4 (1989)). After modification with λ prophages
or plasmids, strains were catalogued by assigning them a BS number and stored
at -80°C with 10% DMSO (dimethylsulfoxide) after overnight culture in
appropriate selection medium.

Transformation of E. coli

For all plasmid transformations of E. coli strains, electroporation was
preferred over chemical protocols. Electrocompetent cells were made and used
for electroporation as described by Smith et al., Focus, 12:38-40 (1990). The
appropriate cell porator and cuvettes were from Life Technologies. Depending
on the selection procedure after electroporation, the time in SOC medium
(Smith et al., Focus, 12:38-40 (1990)) at 37°C under agitation (Lab-Line® Orbit
Environ-Shaker, Lab-Line Instruments, Inc., Melrose Park, IL) prior to plating
on selection medium was 1 h for ampicillin (Ap) and 2 h or more for
kanamycin (Kan) selection. For induction of cre expression (as described
below), the transformants were cultivated in SOB (Smith et al., Focus, 12:38-40
(1990)) supplemented with 0.2% of L-(+)-arabinose (Sigma-Aldrich) plus
20 mM of MgCl2 (referred to as induction medium) for 2.5 h and 4 h before
plating on the appropriate selection media (see below). Resulting colony
numbers were counted after overnight incubation at 37°C in a gravity
convection incubator (Precision Scientific, Chicago, IL).
E. coli Cultures

LB (as mentioned above) was used as the standard medium for all E.
coli cultures (liquid or solid). For selection and screening, the appropriate
reagents at the following concentrations were added:

Table 1: List of reagents used for selection and screening of E. coli
cultures.

Reagent                 Concentration   Stock Solution
Ampicillin (Ap)         100 µg/ml       50 mg/ml in H2O
Chloramphenicol (Cm)    27 µg/ml        34 mg/ml in EtOH
Kanamycin (Kan)         16 µg/ml        10 mg/ml in H2O
X-gal                   0.003%          2% in DMF (w/v)
Z-ara                   0.006%          2% in DMF (w/v)

The concentrations of stock solutions, stored at -20°C, and their dilutions
in liquid LB medium or LB-agar plates are given. X-gal stands for 5-bromo-4-
chloro-3-indolyl-β-D-galactopyranoside, and Z-ara for 5-bromo-3-indolyl-α-L-
arabinofuranoside. All reagents, except Z-ara, were purchased from Life
Technologies and aliquoted in the desired stock concentration for storage. Z-ara
(Berlin and Sauer, Anal. Biochem., 243:171-175 (1996)) was a generous gift
from W. Berlin.

Ready-to-use solid LB-agar plates (2%), plain or supplemented with Ap
(100 µg/ml), were purchased from Digene, Inc. (Beltsville, MD). For all other
reagent combinations in solid medium, plates were poured according to the
needs using autoclaved 2% LB agar purchased from Biofluids.
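
The stock and working concentrations of Table 1 imply simple dilution volumes when plates or liquid cultures are prepared. The following is a minimal sketch of that arithmetic, not part of the described procedure. It assumes the antibiotic concentrations in Table 1 read as µg/ml, and it ignores the small volume contributed by the stock itself; the function and dictionary names are illustrative only.

```python
# Working and stock concentrations as read from Table 1 (assumed reading).
# Each pair uses one unit (µg/ml for antibiotics, % w/v for indicators),
# so the unit cancels in the dilution ratio.
TABLE_1 = {
    "Ap": (100.0, 50_000.0),    # 100 µg/ml from a 50 mg/ml stock
    "Cm": (27.0, 34_000.0),     # 27 µg/ml from a 34 mg/ml stock
    "Kan": (16.0, 10_000.0),    # 16 µg/ml from a 10 mg/ml stock
    "X-gal": (0.003, 2.0),      # 0.003% from a 2% stock
    "Z-ara": (0.006, 2.0),      # 0.006% from a 2% stock
}


def stock_volume_ul(final_conc, stock_conc, medium_ml):
    """Return microliters of stock needed for medium_ml of medium (C1*V1 = C2*V2).

    Approximation: the added stock volume is not counted in the final volume.
    """
    if stock_conc <= 0 or final_conc < 0 or medium_ml < 0:
        raise ValueError("stock must be > 0; final concentration and volume >= 0")
    return final_conc / stock_conc * medium_ml * 1000.0


if __name__ == "__main__":
    for name, (final_conc, stock_conc) in TABLE_1.items():
        volume = stock_volume_ul(final_conc, stock_conc, 500.0)
        print(f"{name}: {volume:.0f} µl of stock per 500 ml of medium")
```

With these values, each milliliter of medium takes 2 µl of the Ap stock and 1.5 µl of the X-gal stock.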
Polymerase Chain Reaction (PCR) 

Standard PCR reactions were carried out in 50 µl total volume with the
following reagents (all, except noted, from Perkin Elmer, Foster City, CA): 1x
PE buffer II (without MgCl2), 2 mM MgCl2, 250 µM of each dNTP, 0.8 µM of
each primer, ca. 50 ng of template DNA, qsp. H2O (HPLC grade, Sigma-
Aldrich) to 49.5 µl. For mutagenic PCR reactions (also referred to as error-
prone PCR), the amount of each dNTP was reduced to 20 µM and 0.25 mM of
MnCl2 added. After denaturation at 95°C for 5 min, 0.5 µl of 5 U/µl PE AmpliTaq
Polymerase was added at approximately the annealing temperature of the
primers. After mixing, the appropriate thermal cycles were carried out (as
indicated individually below), using a PTC 200 thermal cycler (DNA Engine,
MJ Research, Cambridge, MA). When finished, all PCR products immediately
were loaded on an agarose gel, or separated from enzyme, nucleotides and
primers by applying the Wizard™ PCR Preps Kit (Promega, Wizard™ Minipreps
Plus DNA Purification System. Instruction Manual (Madison, WI) (1/96)).
PCR products were recovered using deionized water and stored frozen at -20°C.
Sequence Analysis

Sequence analysis of plasmid constructions and cre mutants was
carried out on a PE ABI PRISM™ 310 Genetic Analyzer (Perkin Elmer)
according to recommendations in the manufacturer's protocol P/N 402078
Revision A (1995) for the ABI PRISM™ Dye Terminator Cycle Sequencing Kit
(Perkin Elmer). Briefly, a cycle sequencing reaction contained ca. 50 ng of
template DNA in miniprep quality, 4 pmol of primer, and 8 µl of the ABI
Terminator Ready Reaction Mix (Perkin Elmer), in a total volume of 20 µl, and
was subjected to the following conditions: (96°C, 10 s; melting temperature of
primer, 15 s; 60°C, 4 min) 26 times on a PTC 200 thermal cycler (MJ
Research). After removal of residual primers and dye by ethanol precipitation,
the DNA was resuspended in 25 µl of ABI Template Suppression Reagent
(Perkin Elmer) and denatured at 95°C for 5 min before loading the ABI Genetic
Analyzer. The obtained data files were examined using ABI PRISM™
Sequencing Analysis Version 3.0 Software (1996, Perkin Elmer). Gene Jockey
II (1996, Biosoft, Cambridge, UK) software was used for sequence comparison,
translation, and alignments.

Mutagenesis Procedure

Substrate Preparation by PCR

The cre gene for the following DNase I shuffling reaction was amplified
by PCR using 5' forward primer BSB436 (5'
AAATAATCTAGACTGAGTGTGAAATGTCC 3') and 3' reverse primer
BSB376 (5' ATATATAAGCTTATCATTTACGCGTTAATGG 3'),
introducing an Xba I and a Hind III cloning site, respectively (underlined).
Mutagenic and non-mutagenic PCRs were carried out: (94°C, 30 s; 52°C, 30 s;
72°C, 90 s) 45 or 30 times, respectively. The 5' primer was designed to include
the endogenous Shine-Dalgarno (SD) sequence of cre, whereas its three endogenous
promoters were excluded (position -17, 6; positions refer to the adenine of the
start codon of the cre coding sequence as position 1). Thus, after introducing
the resulting cre genes into pBAD33 (see below), expression was exclusively
under control of the pBAD promoter without interference or background
expression due to endogenous promoters. Including the SD sequence of cre
was necessary, since pBAD33 does not contain this sequence 5' of its multiple-
cloning-site (MCS). The 3' reverse primer was designed to be homologous to
the 3' untranslated region (UTR) of cre (position 1057, 1032). Mutagenic
events were therefore permitted in 1020 bp of the entire 1026 bp cre coding
sequence, excluding the first two codons. For the first round of the directed
evolution procedure, the wt cre expression plasmid pBS185 (Sauer and
Henderson, The New Biologist, 2:441-449 (1990)) served as template. In
following cycles, the pool of mutated cre genes from the previous round was
used. In all experiments, both mutagenic and non-mutagenic PCRs were
carried out in parallel using the appropriate template.
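
As a quick consistency check on the primer design described above, the cloning sites can be located in the primer sequences and the mutable window recomputed. This is an illustrative sketch only. The primer strings are the sequences as read from the text, with bases obscured by the lost underlining restored to the named Xba I (TCTAGA) and Hind III (AAGCTT) recognition sites, which is an assumption. The helper name `site_positions` is hypothetical.

```python
# Recognition sequences of the two cloning sites named in the text.
SITES = {"Xba I": "TCTAGA", "Hind III": "AAGCTT"}

# Primer sequences as read from the text (restored bases are an assumption).
PRIMERS = {
    "BSB436": "AAATAATCTAGACTGAGTGTGAAATGTCC",
    "BSB376": "ATATATAAGCTTATCATTTACGCGTTAATGG",
}


def site_positions(primer):
    """Return the 0-based start of each site in the primer, or -1 if absent."""
    return {enzyme: primer.find(site) for enzyme, site in SITES.items()}


# Mutable window: the whole coding sequence minus the first two codons.
CDS_LENGTH_BP = 1026
EXCLUDED_CODONS = 2
mutable_bp = CDS_LENGTH_BP - 3 * EXCLUDED_CODONS

if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, len(sequence), "nt", site_positions(sequence))
    print("mutable window:", mutable_bp, "bp")
```

Each primer carries exactly one of the two sites, preceded by six extra bases, and the mutable window comes out at the 1020 bp stated in the text.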

Homologous Recombination in Vitro

DNase I digest

Approximately 5 µg of the cre PCR product (ca. 1.1 kb) were digested
with 0.03 U of DNase I, type IV (Sigma-Aldrich) in 20 µl total volume of 50
mM Tris-HCl pH 7.5 plus 1 mM MgCl2 for 2 to 3 minutes at room temperature.
After digestion, samples immediately were loaded on a 2% minigel to separate
the generated fragments (Figure 5). Fragments of 25 bp to 300 bp were purified
from the gel by DE81 (Whatman, Maidstone, GB) extraction and ethanol
precipitation (Sambrook et al., Cold Spring Harbor, New York: Cold Spring
Harbor Laboratory Press (Second Edition) (1989)), before suspending in 5 µl of
TE pH 8.0.

Self-Priming PCR

A 60 cycle non-mutagenic PCR (as described above) was carried out
without added primers, allowing the fragments to prime themselves and thereby
to undergo shuffling while reassembling. Conditions for PCR were: 94°C, 90
s; (94°C, 30 s; 45°C, 30 s; 72°C, 90 s) 60 times; 72°C, 10 min.
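
For planning purposes, the hold time of such a cycling program follows directly from the step durations. The sketch below adds up the programmed holds of the self-priming program as stated (90 s initial denaturation, 60 cycles of 30 s, 30 s and 90 s, then a 10 min final step). Ramp times between temperatures depend on the instrument and are not included, so the real run is longer. The function name is illustrative.

```python
def program_hold_seconds(initial_s, cycle_steps_s, n_cycles, final_s):
    """Sum the programmed hold times of a thermal cycling program, in seconds.

    Ramp times between temperatures are instrument-specific and not included.
    """
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    return initial_s + n_cycles * sum(cycle_steps_s) + final_s


# Self-priming program: 90 s; (30 s; 30 s; 90 s) x 60; 10 min.
self_priming_s = program_hold_seconds(90, (30, 30, 90), 60, 10 * 60)

if __name__ == "__main__":
    print(f"hold time: {self_priming_s} s ({self_priming_s / 60:.1f} min)")
```

The programmed holds alone add up to a little over two and a half hours.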

Reassembling of cre

Since the self-priming step never yielded a single size product but rather
a range of fragments between 300 bp and 2000 bp (Figure 5), the self-priming
PCR mixture was diluted 1/40 in a non-mutagenic PCR mix with primers
BSB376 and BSB436 (see above), and subjected to an additional 20 cycles
(94°C, 30 s; 52°C, 30 s; 72°C, 90 s). This additional step led to one product of
1.1 kb size (Figure 5).
cre Expression

After digesting the linkers of the generated mutant cre pool with Xba I
and Hind III, the fragments were ligated into the identical sites of the MCS of
the cre expression vector pBAD33 (Figure 6). Two features favored the choice
of pBAD33 as the vector to express the mutant cre pool for the selection
procedure (see below): First, its pACYC184 derived origin of replication is
compatible with the ColE1 derived ones of the plasmids used in the selection
procedure. Second, pBAD33 contains the promoter of the arabinose operon
(pBAD) and also expresses the regulatory protein AraC. It is therefore
possible to regulate the expression of a gene cloned into the MCS and under
pBAD control, from moderately high levels to nearly complete repression, by
simply changing 0.2% L-(+)-arabinose in the medium to 0.2% D-glucose
(Guzman et al., J. Bacteriol., 177:4121-4130 (1995) and Miyada et al., Proc.
Natl. Acad. Sci. USA, 81:4120-4124 (1984)). As indicated above, the primers for
cre PCR were designed in order to include the endogenous SD sequence but to
exclude the three cre promoters. cre expression therefore will be under
complete control of the pBAD promoter. This is important for the selection
procedure (see below), which was intended for few Cre molecules acting on
different lox sites. High concentrations or long term background expression of
cre could eventually defeat the selection, since wt Cre also catalyzes, at very low
frequencies, recombination events between altered lox sites.

Plasmids and E. coli Strains Used for Selection and Screen

Mutant lox sites

Figure 1 compares the original loxP site to the two mutant sites, loxK1
and loxK2, used during the described experiments. The lox sites with 5' Sal I
and Xho I compatible, and 3' Xba I and Nhe I compatible ends were received as
single stranded oligonucleotides from Midland and annealed by heating the
appropriate ones together at 70°C, followed by a gradual cool down.

Plasmids for Selection and Screening

Plasmid pBS561 was constructed using three fragments: (i) the 5'
modified neo gene derived from pBS398 (Sauer et al., Methods, 4:143-149
(1992)), (ii) the RSVneo (Gorman et al., Science, 221:551-553 (1983)) backbone
without the neo gene, and (iii) the oligonucleotide-derived MCS (Figure 7).
The EGFP gene derived from pEGFP-N1 (Clontech, Palo Alto, CA) was then
inserted into the MCS along with 5' and 3' lox sites orientated in the same
direction to produce plasmids pBS568 (loxK1²) and pBS569 (loxK2²). Figure 8
summarizes the procedures used to construct plasmids pBS583 and pBS584. To
restore the original neo reading frame without the 5' extension, the loxK²
cassettes containing Mlu I/Kpn I-fragments from pBS568 and pBS569 were
ligated into the RSVneo backbone that contains Mlu I and Bgl II sites. The Bgl
II - Kpn I junction was achieved by filling the 3' recessed end of Bgl II with
Klenow (NEB) followed by a blunt-end ligation to the Kpn I end. This junction
also was checked by sequencing and found to be correct. A transcriptional
terminator, rrnBT1T2, derived from pBAD33 by non-mutagenic PCR with
primers BSB425 (5' ATAA GCGGCCGCT GAGCTTGGCTGTT TTGGCGG
3') and BSB426 (5' GCCG TCTCGAGA GAGTTTGTAGAAACGCAAAAAGGC
3'), was inserted into the loxK² cassette 3' of the
EGFP gene after digest of its Not I and Xho I linkers (underlined). With this
construct, it could be predicted that a catalyzed recombination by K1⁺ or K2⁺
Cre mutants between the lox sites would result in the excision of EGFP and the
transcriptional terminator, and thereby permit the transcription of the neo gene
due to the RSV promoter, located 5'. Expression of neo would no longer be
impaired, because it is placed under control of 5' promoter elements present in
RSVneo (a Kanʳ rendering plasmid in E. coli).

A similar loxP² cassette selection plasmid also was designed (pBS613,
Figure 9), to be used as a control. Using this plasmid, the frequency of loxP
recombination by Cre mutants could be determined in the same manner as used
to evaluate loxK1 or loxK2 recombination by pBS583 and pBS584.

Finally, a completely different set of loxK1²/K2² cassette plasmids was
created. These plasmids were no longer used for selecting mutants that
recognize loxK1 or loxK2, but rather were used to screen for these mutants in
conjunction with a different bacterial background (see below). Figure 10
summarizes the construction process for pBS601 (loxK1²) and pBS602
(loxK2²): In a first step, the neo resistance marker of pBS581 and pBS582
(intermediates in the construction of pBS583 and pBS584) was removed by
deleting the Pvu II fragment, thus restoring the possibility to use neo for a
different selection procedure. Following this, the EGFP gene between the lox
sites was replaced by the pUC19 (Yanisch-Perron et al., 1985) derived lac
promoter with its 3' MCS. This pUC19 fragment was obtained by non-
mutagenic PCR with primers BSB448 (5' GTCAAGCTAGCT AGCAGGTTTCCCGACTGG
3') and BSB449 (5' ArATTfinnfiCCGCAGATCTCCTCTAGAGTCGACCTG
3'). An Nhe I site 5', and a Bgl II and a Not I site 3'
(underlined) were introduced thereby, which made it possible to replace the Nhe
I - Not I EGFP fragment after linker digestion. The newly generated polylinker
between the two lox sites permitted insertion of the Xba I - BamH I fragment of
pBS481 (that carries the abfA.st marker gene) into its Xba I and Bgl II sites. As
shown recently, E. coli strains expressing a recombinant α-L-
arabinofuranosidase gene from Streptomyces lividans (abfA.st) can be detected
by eye on LB plates containing 5-bromo-3-indolyl-α-L-arabinofuranoside (Z-
ara) (Berlin and Sauer, Anal. Biochem., 243:171-175 (1996)). This leads to the
formation of an indigo blue pigment, similar to the classical lacZ/5-
bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal) marker system. It
could therefore be expected that E. coli strains expressing mutant Cre
recombinases which allow recombination of loxK1 or loxK2 should lose the
abfA.st gene and should form white colonies on Z-ara plates. All other clones,
however, should be dark blue. Using abfA.st instead of the well established
lacZ screen was necessary, because the E. coli strain (see below) used for this
construct expresses β-galactosidase endogenously.
E. coli Strains for Selection and Screen

The E. coli strain BS583, DH5 Δlac (λD69 loxP²[lacZ LEU2]), was
chosen as the bacterial background for the selection procedure for K1⁺ or K2⁺
Cre mutants by plasmids pBS583 or pBS584. Due to the loxP²[lacZ] containing
λ prophage, Cre activity on loxP can be evaluated simply by using X-gal plates.
The selection strains BS1493 and BS1494 were made by introducing the
selection plasmids pBS583 and pBS584 into BS583 (Table 2). The loxP²
plasmid pBS613, to be used as a control, needed to be in BS583 cells as well,
becoming the strain BS1541 (Table 2).

For the screening plasmids pBS601 and pBS602, E. coli strain NS2300
(Sternberg et al., J. Mol. Biol., 187:197-212 (1986)) was selected as host: K12
recA::Tn10 loxP²[neo]. This strategy combines a kanamycin selection
for Cre enzymes that are no longer active on loxP (P⁻) with a screen for K1⁺ or
K2⁺ enzymes. By transforming pBS601 and pBS602 into NS2300, the P⁻
selection strains BS1523 and BS1524 were formed (Table 2).
Selection and Screen for Cre Mutants

Selection for K1⁺/K2⁺ and Screen for P⁻

After ligation of the generated mutant cre pool into pBAD33 for 3 h,
BS1493 or BS1494 electrocompetent cells were transformed with 2 µl of the
microdialyzed reaction mixture (VS membrane, Millipore™, Bedford, MA). To
induce expression of the cre pool, the transformed cells were incubated at 37°C
in induction medium for 2.5 h and/or 4 h under agitation (as mentioned before).
Cultures were diluted 1/500 or 1/5000 and grown on LB plates with the
following formulation: Ap, Cm, glucose, and X-gal for determining the
transformation efficiency, referred to as non-selection plates. Dilutions of 1/5
and occasionally 1/50 were grown on plates with addition of Kan, used to select
for K1⁺ or K2⁺ mutants and called selection plates. The formulation of the
plates served the following purposes: (i) Ap and Cm were added to assure that
all clones contained both selection and expression plasmid, (ii) X-gal to
distinguish between P⁺ and P⁻ clones (Table 2). After overnight incubation at
37°C, blue and white colonies were counted and pools prepared for the next
round of DNA shuffling. Alternatively, certain mutants were chosen for further
analysis (see below).

Selection for P⁻ and Screen for K1⁺ or K2⁺

After 2.5 h and/or 4 h of expression of the mutant cre pool in the
transformed BS1523 and BS1524 cells, usually dilutions of 10*^ or lO"^ were 
grown on LB agar plates supplemented with the same reagents as listed above,
except that Z-ara replaced X-gal to allow the K1⁺/K2⁺ screen. Non-selection
plates were used for determining the transformation efficiency, and Kan
containing plates for the P⁻ selection (Table 2).
Mutant Analysis

wt cre Expression Plasmid

With fewer cycles (15) of non-mutagenic PCR on the cre expression
plasmid pBS185 and after linker digestion, the cre pool obtained was cloned
into pBAD33 and transformed into BS583 cells. After 2 h of cre expression,
the transformed cells were grown on X-gal plates. After overnight incubation at
37°C, two white colonies (indicating loxP recombination) were picked for
plasmid preparation and complete sequencing. No point mutation was found in
either one, so that each could be used as a control plasmid for wt Cre
expression. One of the two was selected for further use and named pBS606.
Functional Testing

In order to determine the frequency of lox recombination of isolated
mutant Cre enzymes by the described selection procedure, it is necessary to
separate the cre candidate expression plasmid (pBAD33) from the selection
plasmid of the chosen Kanʳ candidate. Then, the cleaned expression vector can
be used to retransform the appropriate selection and screening strain BS1494, as
well as BS1493 and BS1541, to determine the candidate's capacity for loxK2,
loxK1 and loxP recombination under identical conditions. By comparing the
resulting frequencies of Kanʳ of different Cre mutants and wt Cre, all treated
identically, one can determine quantitatively how well each chosen mutant
candidate really performs on the altered lox sites.
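
The quantitative comparison described here amounts to dividing the dilution-corrected colony count on selection plates by the dilution-corrected count on non-selection plates. A minimal sketch of that calculation follows. It assumes equal plated volumes on both plate types and expresses each dilution as a fraction (1/5 as 0.2); the function name and the colony counts in the usage lines are hypothetical and are not data from the experiments.

```python
def recombination_frequency(sel_colonies, sel_dilution,
                            nonsel_colonies, nonsel_dilution):
    """Estimate the fraction of transformants that became Kan-resistant.

    Dilutions are fractions of the culture (for example 1/5 -> 0.2).
    Assumes the same volume was plated on both plate types.
    """
    if sel_dilution <= 0 or nonsel_dilution <= 0:
        raise ValueError("dilutions must be positive fractions")
    if nonsel_colonies <= 0:
        raise ValueError("need at least one colony on the non-selection plate")
    selected = sel_colonies / sel_dilution        # Kan-resistant clones, undiluted scale
    total = nonsel_colonies / nonsel_dilution     # all transformants, undiluted scale
    return selected / total


if __name__ == "__main__":
    # Hypothetical counts: 40 colonies at 1/5 on selection plates,
    # 200 colonies at 1/500 on non-selection plates.
    freq = recombination_frequency(40, 1 / 5, 200, 1 / 500)
    print(f"recombination frequency: {freq:.4f} ({freq * 100:.2f}%)")
```

Because the same formula is applied to every candidate and to wt Cre, the resulting frequencies can be compared directly across strains BS1493, BS1494 and BS1541.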

Therefore, overnight cultures of candidates were grown in liquid LB
supplemented with Cm and Kan for plasmid minipreps, yielding a mixture of
both the mutant Cre expressing plasmid pBAD33 and the newly Kanʳ selection
plasmid. In order to eliminate the latter, minipreps were digested with the
restriction enzyme Aat II, which only cuts the selection plasmid. After
transformation of BS583 cells with this digestion mixture and approximately 2
h of cre expression, different dilutions were grown on LB agar plates
supplemented with Cm and X-gal to select for pBAD33. Plates with Ap plus
Cm were used to determine the background of contamination with uncut
selection plasmid. The next day, clones obtained by the Cm selection were
tested for Apˢ and Kanˢ to confirm the elimination of the selection plasmid. A
final overnight culture, followed by a miniprep procedure, yielded the unique
plasmid for functional testing, as described above.
Sequencing

To obtain the DNA sequence of candidate cre genes in pBAD33, eight
primers (BSB454 to BSB461, Table 3), four for each strand, were designed so
that the entire gene could be sequenced in both directions.

Site-Directed Mutagenesis

After identifying one essential mutation for the decrease in substrate
specificity, the Stratagene QuikChange™ Site-Directed Mutagenesis Kit
(Stratagene Cloning Systems, La Jolla, CA) was used to create cre mutants with
mutations at the determined location only. Using three different mutant primer
sets (BSB465 to BSB470, Table 4), all steps were carried out as detailed in the
manufacturer's instruction manual, except that electrocompetent BS1494 cells
were used for transformation and mutant selection, replacing the provided XL1-
Blue cells. The resulting mutant candidates were subjected to functional testing
and sequencing as detailed before.

In a different experiment, the DNA shuffling mutagenesis procedure was
repeated on wt cre by adding one 5' phosphorylated strand of each set of mutant
oligonucleotides (BSB465, BSB467 and BSB469) to the pool of small
fragments prior to reassembly. This allowed them to be incorporated into the
resulting cre pool. The desired mutations should consequently be introduced at
much higher frequency than without the addition of oligonucleotides.
RESULTS

Establishment of the Selection Procedure

Selection Plasmids (pBS583 and pBS584)

To test the generated plasmids pBS583 and pBS584, a recombination
event between their lox sites was mimicked by digesting pBS566 and pBS567
(intermediates of the pBS583/584 construction, containing only the 3' lox site)
with Sal I and Xba I, followed by religation. This led to excision of the EGFP
gene and terminators. After deletion of EGFP and rrnBT1T2, the Kanʳ
phenotype was observed as anticipated. In addition, the frequency of
spontaneously occurring Kanʳ clones carrying the original plasmids was
approximately 10 '. This background is inconsequential, since the
transformation efficiency of BS583 cells was determined as 10' per of
pBAD33.

The equivalent loxP² control plasmid pBS613 was tested directly with the
wt cre expression plasmid pBS606. After 2.5 h of cre expression, 94% of all
clones were determined Kanʳ and about 6% showed blue color. Without cre
expression, no Kanʳ and no white colonies were observed. This confirms that
the control cell line BS1541 (Table 2) permits the combined P⁺ selection and P⁻
screen.

Screening Plasmids pBS601 and pBS602

Only pBS602 was tested before use by expression of wt Cre and a
K2⁺/P⁺ Cre mutant (see below). On non-selection medium, wt Cre expression
resulted in more than 95%, whereas expression of the mutant Cre resulted in
less than 3% of blue colonies. This indicates that excision of the loxK2 flanked
abfA.st marker by K2⁺ Cre is possible. On selection medium, very few colonies
could be found, since both types of cre have shown activity on loxP before (see
below).

Mutagenic vs. Non-Mutagenic PCR

The frequency of P⁻ Cre mutants obtained after non-mutagenic and after
error-prone PCR was determined by the following experiment: After one
mutagenic or one non-mutagenic PCR on the wt Cre expression plasmid
pBS185, the resulting cre pools were inserted into the expression vector
pBAD33 and transformed into BS583 cells. After 2.5 h of arabinose-mediated
induction or glucose-mediated repression (by SOC medium) of cre expression,
dilutions were transferred to LB plates with Ap, Cm, glucose and X-gal. The
results are presented in Table 5: Under glucose repression, exclusively blue
colonies could be identified (first line in Table 5), indicating that cre expression
is insufficient for loxP recombination and excision of lacZ of BS583. Induction
with L-(+)-arabinose, however, led to the formation of white colonies at the
presented frequencies (second line), indicating that (i) the described control of
cre expression by pBAD33 is functioning, and (ii) the mutagenic PCR
conditions cause three times more impaired Cre enzymes for loxP
recombination than the non-mutagenic conditions (60% blue colonies vs.
30%). It is worth mentioning that ligation reactions lacking cre insertion
resulted in 50 to 100 times fewer blue colonies than obtained with the ligations
with cre insertion. This phenotypically blue background of empty pBAD33 was
subtracted before calculating the presented data.

Leung et al., Technique, 1:11-15 (1989) reported that the frequency of
point mutations created by error-prone PCR is about 0.5%. If this is true, on
average five point mutations should occur in each 1 kb cre coding sequence
subjected to an error-prone PCR. By extrapolating this data to the three times
fewer P⁻ enzymes after a non-mutagenic PCR, one can conclude that the
frequency of point mutations should also be reduced by a factor of three to
0.18%. Experiments made by Zhou et al. (1991) showed 11% of a 633 bp
marker gene phenotypically impaired after non-mutagenic PCR. About 37% of
all genes in the pool, however, carried at least one point mutation. Even though
the conditions for the non-mutagenic PCR were similar, the observed
discrepancy between 11% and 20% of phenotypical mutants may be due to a
variety of reasons, among which: (i) the size difference between the two genes
(633 bp vs. 1020 bp), (ii) different elongation times during PCR, and (iii)
different sensitivity of the two proteins to disabling point mutations.
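
The arithmetic behind these estimates can be made explicit. The sketch below computes the expected number of point mutations per gene from a per-base frequency, and converts between a per-base frequency and the fraction of genes carrying at least one mutation. The Poisson model used for that conversion (independent, rare mutations) is an assumption of this example and is not stated in the text; the function names are illustrative.

```python
import math


def expected_mutations(rate_per_bp, length_bp):
    """Mean number of point mutations in a sequence of length_bp."""
    return rate_per_bp * length_bp


def fraction_with_mutation(rate_per_bp, length_bp):
    """Fraction of genes with at least one mutation (Poisson assumption)."""
    return 1.0 - math.exp(-expected_mutations(rate_per_bp, length_bp))


def rate_from_fraction(fraction_mutated, length_bp):
    """Per-base frequency implied by the mutated fraction (Poisson assumption)."""
    if not 0.0 <= fraction_mutated < 1.0:
        raise ValueError("fraction_mutated must be in [0, 1)")
    return -math.log(1.0 - fraction_mutated) / length_bp


if __name__ == "__main__":
    # Error-prone PCR at 0.5% over the 1020 bp mutable window of cre.
    print(f"expected mutations: {expected_mutations(0.005, 1020):.1f}")
    # 37% of 633 bp genes carrying at least one mutation.
    implied = rate_from_fraction(0.37, 633)
    print(f"implied per-base frequency: {implied * 100:.3f}%")
```

Under this model, a 0.5% frequency gives about five mutations per cre gene, in line with the estimate in the text, while 37% of mutated 633 bp genes corresponds to a per-base frequency below 0.1%.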
Testing wt Cre on loxK1² vs. loxK2² Substrates

The level of loxK1 and loxK2 recombination due to wt cre expression
was determined using the wt Cre control plasmid pBS606. After transformation
of the cell lines BS1493 and BS1494 with pBS606 and 2.5 h and 4 h of cre
expression, cells were grown on selection and non-selection plates (as described
previously). The recombination frequency between the altered lox sites was
considered equal to the observed frequency of Kanʳ phenotype: for loxK1, it
was about 10'^ after 2.5 h and 2 x 10'* after 4 h of wt cre expression; for loxK2,
it changed from about 2 x 10"* to 2 x 10'^. All colonies found were white,
indicating effective loxP recombination by wt Cre within the allowed expression
time. This result shows that long term expression of the wt enzyme permits a
slight increase in recombination between the altered lox sites. The use of
pBAD33 to avoid background lox recombination by suppressing cre expression
is therefore justified. Because loxK2 was eventually 100 fold better recognized
by wt Cre than loxK1 (but still at low frequency), first the creation of novel Cre
recombinases with loxK2 specificity was attempted.
Mutagenesis on loxK2

First Rounds of Directed Evolution

The results of the first four rounds of the described mutagenesis
procedure on loxK2 with selection plasmid pBS584 are presented in Table 6.
The following symbols are used to describe the status of the DNA shuffling
procedure for cre: "o" indicates a non-mutagenic PCR, "m" a mutagenic PCR,
and "x" stands for the in vitro reassortment event. For example, mxoxox cre
represents a cre pool subjected to three rounds of the directed evolution process,
with a mutagenic PCR followed by in vitro shuffling in the first, and non-
mutagenic PCRs and shuffling as mutagenic and recombinogenic events in the
following two rounds. The phenotypically blue background due to empty
pBAD33 was subtracted from all results by control ligations without cre
insertion. In every round, error-prone and non-mutagenic PCR served as the
necessary mutagenic event on the template pool of the previous round. After in
vitro reassortment and selection, always the larger Kanʳ population of the two
parallel experiments was chosen as template for the next round (as indicated in
the last column of Table 6). Only in the first round could the error-prone PCR
lead to more candidates, whereas in all following rounds the reduced mutagenic
frequency of the non-mutagenic PCR turned out to be more beneficial. The
density of point mutations resulting from two mutagenic PCRs was obviously
too high to allow efficient separation of deleterious mutations from
advantageous ones during the in vitro shuffling step. This is confirmed by the
high frequencies of blue colonies found within any pool in any round subjected
to mutagenic PCRs twice. Error-prone PCR in the context of the applied
selection therefore appears to be useful in the first round only, where its three
times higher mutagenic frequency increases the amount of beneficial mutations
compared to a non-mutagenic PCR. With increasing cycle numbers, non-
mutagenic PCRs should be preferred to avoid high densities of deleterious
mutations.
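
The pool notation introduced above ("m", "o" and "x") can be read mechanically, which is convenient when pool histories are tabulated. The following sketch decodes a history string such as mxoxox into one entry per round. It assumes that every round consists of exactly one PCR symbol followed by one "x"; the function name is illustrative.

```python
PCR_TYPES = {"m": "mutagenic PCR", "o": "non-mutagenic PCR"}


def parse_pool_history(code):
    """Decode a pool history such as 'mxoxox' into the PCR type of each round.

    Assumes each round is one PCR symbol ('m' or 'o') followed by 'x',
    the in vitro reassortment event.
    """
    if len(code) % 2 != 0:
        raise ValueError("each round needs a PCR symbol followed by 'x'")
    rounds = []
    for i in range(0, len(code), 2):
        pcr, shuffle = code[i], code[i + 1]
        if pcr not in PCR_TYPES or shuffle != "x":
            raise ValueError(f"unexpected symbols in round {i // 2 + 1}: {pcr + shuffle!r}")
        rounds.append(PCR_TYPES[pcr])
    return rounds


def count_mutagenic_rounds(code):
    """Number of rounds in the history that used error-prone PCR."""
    return parse_pool_history(code).count(PCR_TYPES["m"])


if __name__ == "__main__":
    for number, pcr_type in enumerate(parse_pool_history("mxoxox"), start=1):
        print(f"round {number}: {pcr_type}, then in vitro reassortment")
```

Counting the mutagenic rounds in a history makes it easy to flag pools that went through error-prone PCR twice, which the text identifies as unfavorable.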

The established directed molecular evolution process allows effective
evolution of cre. Column five of Table 6 shows that with every round the
number of Kanʳ colonies is increasing, while the time for cre expression could
be lowered from 4 h to 2.5 h (column three). After only three rounds, Cre
mutants capable of loxK2 recombination at decreased concentrations due to the
reduced expression time were found. However, it was not possible to isolate
any blue colony on the selection plates. All identified K2⁺ mutants therefore are
also P⁺. As mentioned above, high densities of deleterious mutations in the cre
pool subjected to error-prone PCRs twice could explain why no blue colonies
were seen on selection plates, even with over 90% of P⁻ candidates on non-
selection plates.

Evaluation of Six K2⁺/P⁺ cre Mutants

Functional Test

As indicated in Table 6, 36 white Kanʳ colonies could be isolated from
the mxoxox cre pool after only 2.5 h of cre expression with a 1/5 dilution
grown on selection plates. This result indicated that competent Cre mutants
capable of loxK2 recombination were produced. Six were selected for further
analysis. After elimination of the selection plasmid from the minipreps (as
described in Materials and Methods), all six of them, as well as wt Cre
(pBS606), were subjected to the described functional test on loxK2, but also on
loxK1 and loxP recombination with plasmids pBS584, pBS583 and pBS613.
The results are presented in Table 7.

Briefly, by selecting for loxK2 recombination, all mutants except
mxoxox 4 showed significantly increased percentages of Kan^R (between 3%
and nearly 70%) compared to wt Cre (0.002%), as indicated in column three of
Table 7. This indicates a 10^3 to over 10^4 fold increase in activity on loxK2.
On loxP, all (including mxoxox 4) showed recombination frequencies
between 80% and 100% after 2.5 h of cre expression (column 4). This was
expected from the results obtained with the X-gal screen for loxP recombination
during the selection procedure. 2.5 h of induction for wt cre expression is
therefore sufficient for almost complete loxP recombination, justifying 2.5 h of
expression of mutant cre pools for selecting competent K2+ Cre mutants. The
observed slight decrease in loxP recombination with the mutants mxoxox 3 to 6
either derives from usual variations during experiments or may indicate a
slightly reduced loxP activity. With BS1541, blue colonies on both selection
and non-selection plates were found at approximately the same frequency (2%
to 20%) as kanamycin sensitivity (Kan^S). This indicates competition between
the loxP^2[lacZ] site on the genome and the loxP^2[EGFP-rrnBT1T2] sites on
pBS613. Since the same Cre mutants never resulted in blue colonies during
selection for loxK2 recombination in cell line BS1494, it is possible to conclude
that loxP is still preferred over loxK2. This argument is supported by higher
frequencies of loxP recombination, close to 100%, compared to the loxK2
recombination frequencies of 3% to 70%.
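
The fold increase quoted in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the function name is an assumption introduced here, and the only inputs taken from the text are the wt Cre frequency on loxK2 (0.002%) and the mutant range (3% to nearly 70%).

```python
import math


def fold_increase(mutant_pct, wt_pct):
    """Ratio of a mutant's recombination frequency to the wild-type one.

    Both arguments are percentages of Kan^R colonies.
    """
    if wt_pct <= 0:
        raise ValueError("wild-type frequency must be positive")
    return mutant_pct / wt_pct


WT_LOXK2_PCT = 0.002  # wt Cre on loxK2, as quoted in the text

# Lower and upper ends of the mutant range quoted in the text.
for label, pct in [("lowest mutant", 3.0), ("highest mutant", 70.0)]:
    fold = fold_increase(pct, WT_LOXK2_PCT)
    # log10 gives the order of magnitude of the increase.
    print(f"{label}: {fold:,.0f}-fold (about 10^{math.log10(fold):.1f})")
```

The two ends of the range land in the 10^3 and 10^4 decades respectively, which is where the "10^3 to over 10^4 fold" statement comes from.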

All frequencies found for loxK1 recombination, determined by using cell
line BS1493, lie below 0.01% after 2.5 h, as well as after 4 h, of induction of cre
expression (column five). This indicates that no analyzed mutant developed an
increased activity on loxK1 compared to wt Cre.

To summarize, five out of the six analyzed mutants showed a significant
decrease in specificity, resulting in the possibility for loxP and loxK2
recombination.

Sequencing 

The six described mutants have been completely sequenced in both
directions to determine the mutations which lead to the observed decrease in
specificity. The resulting aligned cre coding and aa sequences of all mutants
and wt Cre are represented in Tables 10 and 11. Each mutant showed between
3 and 8 point mutations, altogether 31, as listed in column two of Table 8. The
overall mutagenic frequency can therefore be calculated at 0.5% (31 mutations
in 6 clones of 1020 bp), which is similar to only one round of error-prone PCR
(Leung et al., Technique, 1:11-15 (1989)). The reason for the low frequency of
point mutations after three rounds of the mutagenesis procedure (i.e. nine
PCRs) is the applied selection after each round: cre mutants with a low density of
point mutations seem to be favored by the stringent kanamycin selection.
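
The 0.5% figure follows directly from the numbers given in this paragraph. A minimal sketch of the arithmetic, with a function name chosen here purely for illustration:

```python
def mutation_frequency(n_mutations, n_clones, gene_length_bp):
    """Point mutations per sequenced base pair, pooled over all clones."""
    return n_mutations / (n_clones * gene_length_bp)


# Values quoted in the text: 31 mutations in 6 clones of 1020 bp each.
freq = mutation_frequency(31, 6, 1020)
print(f"overall mutagenic frequency: {freq:.2%} per bp")
print(f"mean point mutations per clone: {31 / 6:.1f}")
```

The pooled estimate treats all 6 x 1020 = 6120 sequenced positions as equivalent, which is the same simplification the text makes.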

26 of the 31 identified point mutations resulted in aa changes compared
to the wt sequence, as indicated in column three. No deletions or frame-shift
mutations, and no codons affected by more than one point mutation at the
same time, could be identified. All possible transition events could be observed,
but only half of all possible transversion events (shown in Table 12). Adenine
to guanine and vice versa transition events represented almost 30% of all
identified point mutations. All other events occurred less frequently (10%, 7%, or
never). The represented statistic may however be biased, either because only six
mutants were analyzed or due to the directed molecular evolution technique
itself: the types of point mutations observed less often may more frequently
be deleterious in cre and were consequently removed from the pools. With
more mutants to be sequenced, this question could be addressed further.
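
For readers less familiar with the terminology, the distinction between transitions and transversions can be made explicit in code. The sketch below is illustrative only (the helper name is an assumption, and it does not use the data of Table 12); it simply enumerates the twelve possible single-base substitutions and classifies each one.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref, alt):
    """Classify a single-base substitution ref -> alt.

    A transition stays within purines or within pyrimidines;
    a transversion switches between the two classes.
    """
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# All ordered pairs of distinct bases, e.g. ("A", "G") means A -> G.
substitutions = list(permutations("ACGT", 2))
transitions = [s for s in substitutions if classify(*s) == "transition"]
transversions = [s for s in substitutions if classify(*s) == "transversion"]

print(len(transitions), "transition types:", transitions)
print(len(transversions), "transversion types:", transversions)
```

Under this classification, "half of all possible transversion events" corresponds to four of the eight transversion types.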

Only one point mutation, a transition event from adenine to guanine at
position 785 in the cre coding sequence, is common to all 5 mutants with
remarkably increased loxK2 activity. This mutation leads to a replacement of a
glutamate residue at position 262 in the J helix of wt Cre by a glycine (indicated
in column five of Table 8). This glutamate is believed to contact the loxP
sequence at positions 11 or 12 with its acidic side chain (Guo et al., Nature,
389:40-46 (1997)). Another point mutation, resulting in a conservative
threonine to serine exchange at position 316, was identified in three mutants.
Five point mutations were found independently in two of the six mutants,
among which were two types of silent mutations. Finally, eleven point mutations
occurred only once (indicated in column four). Therefore, the critical mutation
for loxK2 activity appears to be E262G. Some of the additional mutations could
be responsible for the observed ten fold difference in loxK2 recombination
frequency among the five E262G carrying mutants.
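
The link between nucleotide position 785 and residue 262 can be sketched as follows. The helper names are assumptions made for this illustration, and the codon table holds only the two codons needed here (from the standard genetic code); the GAA codon for glutamate 262 is the one named elsewhere in this document.

```python
# Only the two codons needed for this example (standard genetic code).
CODON_TABLE = {"GAA": "E", "GGA": "G"}


def codon_number(nt_position):
    """1-based codon index for a 1-based position in the coding sequence."""
    return (nt_position - 1) // 3 + 1


def codon_span(codon):
    """First and last 1-based nucleotide positions of a codon."""
    return 3 * codon - 2, 3 * codon


def mutate_codon(codon_seq, codon, nt_position, new_base):
    """Return the codon after replacing the base at nt_position."""
    start, end = codon_span(codon)
    if not start <= nt_position <= end:
        raise ValueError("position lies outside this codon")
    offset = nt_position - start
    return codon_seq[:offset] + new_base + codon_seq[offset + 1:]


codon = codon_number(785)
mutant = mutate_codon("GAA", codon, 785, "G")
print(f"position 785 lies in codon {codon}, spanning {codon_span(codon)}")
print(f"GAA ({CODON_TABLE['GAA']}) -> {mutant} ({CODON_TABLE[mutant]})")
```

Position 785 is the middle base of the codon, so the A to G transition converts GAA into GGA, which is the E262G exchange.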

Site-Directed Mutagenesis to Verify Results

Site-Directed Mutagenesis Procedures 

To determine whether the E262G mutation alone is responsible for the
increase of loxK2 activity of the Cre mutants by at least a factor of 10^3, two
different experiments were made:

First, the described directed evolution procedure was repeated in three
different sets on wt cre, by adding three 5' phosphorylated mutant
oligonucleotides prior to reassembly: incorporation of the first oligonucleotide
(BSB465) into cre should lead to the E262G mutation, incorporation of the
second one (BSB467) to an E262A mutation, and the equimolar mixture of
random oligonucleotides (represented by BSB469) to 20^3 (= 8000) possible aa
combinations at positions 261 to 263 of Cre. According to Stemmer, W. P. C.,
Proc. Natl. Acad. Sci. USA, 91:10747-10751 (1994), these oligonucleotides
should be incorporated during reassembly and cause the desired mutations at a
frequency of about 8%. After insertion of the resulting cre pools into pBAD33
and 2.5 h of expression, 0.8% of white Kan^R colonies were found with BSB465.
Therefore, the frequency of the Kan^R phenotype due to loxK2 recombination is
ten times lower than the expected frequency for the E262G mutation to occur.
This indicates that the E262G mutation increases the specificity for loxK2 from
0.002% of wt Cre (see above) to approximately 10% of recombination during
the standard expression experiment (2.5 h of cre expression prior to selection).
By using the second oligonucleotide (BSB467), only 0.09% of the total white
colonies showed Kan^R. Thus, E262A still favors loxK2 recombination, but by
almost a factor of ten less efficiently than the E262G mutation. With the random
oligonucleotide mixture (BSB469), the frequency of loxK2 recombination
shrank to 0.02%. Compared to a control experiment with no oligonucleotides
added, resulting in 0.003% Kan^R (consistent with the results obtained below),
some of the possible 8000 aa combinations at positions 261 to 263 of Cre
are still expected to favor loxK2 recombination. Since the frequency of blue
colonies on non-selection plates was about 30%, a clear indication for additional
mutations, it cannot be completely excluded that some of the Kan^R
mutants of this experiment carried some additional beneficial
mutations. Nevertheless, this experiment indicated that the E262G mutation is
probably the basis for the significant increase in loxK2 activity.

For better defined mutations, a second experiment was done: the same
oligonucleotides, now in sets for both DNA strands, were used with the
Stratagene QuickChange™ Site-Directed Mutagenesis Kit as described in
Materials and Methods. After transformation of the loxK2 selection strain
BS1494, the percentage of white Kan^R colonies in the three different
experiments could be determined as 6.2% for E262G, 0.8% for E262A and
0.6% for the 8000 different aa combinations at positions 261 to 263.
Due to the QuickChange™ procedure, which eliminates parental DNA, one
could expect that the desired mutations occurred in at least 50% of all clones of
the three different pools. This assumption indicates that the E262G mutation
alone results in approximately 12% or less recombination frequency on loxK2
(6% out of 50% or more carrying the mutation) and the E262A exchange in
1.6% or less (under standard conditions). This calculation is not valid for the
third set of oligonucleotides, because no defined mutation is introduced but
rather a mixture of 8000 different ones. Blue colonies were not found during
this experiment, confirming that the frequency of additional mutations altering
Cre activity on loxP was very low.

The results of both experiments indicate consistently that the E262G
mutation alone is sufficient to increase loxK2 recognition by approximately a
factor of 10^3 compared to wt Cre. The ten times higher frequency observed
with three of the six analyzed mutants after three rounds of the mutagenesis
procedure can only be explained with additional beneficial mutations. The
E262A mutation increases the frequency of loxK2 recombination also, but
approximately by a factor of ten less effectively than E262G.

Sequencing and Elimination of Possible Additional Mutations

Two white Kan^R colonies of each pool derived from the site-directed
mutagenesis procedure were selected and the entire cre gene was sequenced.
Both E262G and E262A candidates showed the desired sequence with no
additional mutations. The random candidates surprised: one of them did not
reveal any point mutation, representing an artifact which managed to survive the
selection, whereas the other one showed nucleotide alterations from position
783 to 786. The wt sequence (781-CTG GAA-786) was found to be changed to
(781-CTT TGG-786), resulting in one silent mutation, conserving L261, and an E262W
exchange. To exclude possible mutations in pBAD33 due to the PCR-based
site-directed mutagenesis procedure, the three defined cre mutants (E262G,
E262A, and E262W) were excised by HindIII and Jft«I and reinserted into the
MCS of fresh pBAD33.

Functional Testing 

The three defined Cre mutants for the amino acid position 262 were
subjected to a functional test for loxK2, loxK1 and loxP recombination activity,
as mentioned before. The results are summarized in Table 9. First, the results
described previously for the E262G and E262A mutations were confirmed: as
indicated in column three, the loxK2 recombination frequency increased 10^3
fold with the E262G mutant compared to the wt enzyme, whereas the E262A
mutant shows only an increase of 200 fold. Surprisingly, the E262W mutant
also achieved a similar activity on loxK2 as seen with the E262G mutant. The
test for loxP recombination frequency with cell line BS1541 (column four)
showed that the ability for loxP recognition is at best slightly impaired by the
three different mutations (as already seen with the analyzed mxoxox mutants).
Again, blue colonies could only be found with this cell line, indicating that loxP
is preferred over loxK2 (as described before). As expected, none of the three
mutants performed significantly better on loxK1 than wt Cre (column five). To
conclude, this final experiment provides the necessary evidence that the E262G
mutation presents the basis for the observed decrease in specificity of Cre.
However, additional mutations seem to be helpful to increase the newly
obtained activity further. In addition, glycine at position 262 is not the only
possible residue to permit a remarkable increase in loxK2 activity.

DISCUSSION

K2+/P+ Cre Mutants

The chosen random mutagenesis procedure linked to the described
selection in E. coli allowed the identification of Cre mutants characterized by a
wider substrate recognition. The evolved enzymes showed wt-like activity on
loxP sites and in addition had almost the same activity on altered loxP sites,
referred to as loxK2. By contrast, the wt enzyme showed only marginal activity
on loxK2^2 substrates.

lox Sites 

loxP and loxK2 differ at several locations as illustrated in Figure 1:
First, the three outermost bp of the inverted repeats are altered, facilitating the
construction of the various plasmids used for selection. Second, the entire
non-canonical 8 bp spacer is completely exchanged, and third, two transversion
events (thymine to adenine) are introduced at positions 11 and 12 of the lox site.
Only the two transversion events are considered important for inhibiting wt Cre
from recognizing the site. Other investigations have shown before that the two
alterations of loxP mentioned first are without inhibiting effect on wt Cre (B.
Sauer, unpublished results). The design of loxK2 was based in part on so-called
cryptic lox sites that were identified previously in the yeast genome (Sauer, B.,
J. Mol. Biol., 233:911-928 (1992)). Another consideration was the choice of a
good starting sequence for the described mutagenesis procedure. Starting with
sites that contained several and/or widespread alterations of loxP was avoided,
because the greater the number of alterations in the substrate, the more the
enzyme would have to be altered. Consequently, to most effectively use the
mutagenesis procedure, the two sites presented in Figure 1, loxK1 and loxK2,
were designed to have only two critical alterations. In initial experiments, wt
Cre was found to recombine loxK2^2 substrate pBS584 slightly better than
loxK1^2 pBS583. This difference may depend on the fact that in loxK2 the two
alterations are located next to each other, while in loxK1 they are separated by
3 bp (positions 14 and 10). Thus, loxK1 could interfere with wt Cre binding at
two distinct DNA-protein interaction sites as compared to loxK2, where only
one location of incompatibility is available. For this reason, loxK2 was chosen
for the initial set of experiments described in this work.

Cre Mutants after Three Rounds of Directed Evolution 

Three iterations of the in vitro evolution procedure were necessary to
identify 36 candidates expressing Cre mutants that could process loxK2 (based
on the applied selection procedure in E. coli). Tests showed that five out of six
selected ones had 10^3 to 10^4 fold increased activity on loxK2 when compared
to wt Cre. On loxP and loxK1, however, there was almost no difference
between wt and the mutant enzymes. The mutants therefore had developed an
increased tolerance for transversions at positions 11 and 12 (loxK2) of the lox
sequence, but not for other positions like 10 and 14 (loxK1). To indicate this
phenotype they were referred to as K2+/P+.

The E262G Mutation

Sequence analysis identified that the five mutants with remarkably
enhanced loxK2 activity (10^3 to 10^4 fold compared to wt Cre) had in common
only one point mutation, leading to the aa change E262G. Site-directed
mutagenesis experiments confirmed that the E262G mutation is sufficient to
increase loxK2 activity by a factor of 10^3 compared to wt Cre.

Based on the recently described crystal structure (Guo et al., Nature,
389:40-46 (1997)), glutamate at position 262, located in the J helix of the
enzyme, may be a DNA contacting residue and permit the formation of a
hydrogen bond between the carboxyl group of its side chain and an amino group
of one of the two adenines at positions 11' or 12' in the loxP sequence (Figure
1). However, changing these two bases to thymines in the loxK2 sequence
should lead to an electrostatic repulsion between their oxygens and the acidic
side chain of glutamate. This could explain the observation that wt Cre is
unable to catalyze a recombination between two loxK2 sites. Exchanging
glutamate for a glycine, which does not have a side chain, should remove the
electrostatic repulsion and thereby permit loxK2 binding. On the other hand,
this alteration could affect loxP binding because an electropositive DNA-protein
interaction may be lost. Results from the mentioned experiments support this
proposition: the E262G mutation alone led to an increase in loxK2
recombination from 0.002% to about 2% and to a slight decrease in loxP
recombination from 94% to about 80%. The 50 times lower frequency of loxK2
compared to loxP recombination may depend on the second thymine or, more
likely, on the complementary adenine (position 11 in loxK2) that could
contribute to sterical repulsion between the J helix and the loxK2 site. To prove
this hypothesis, the role of arginine at position 258, located one helix turn away
from the glutamate, should be further investigated by site-directed mutagenesis.
As proposed by Guo et al., Nature, 389:40-46 (1997), R258 is a DNA
contacting residue that forms hydrogen bonds with the guanine at position 10'
of loxP and may also interact with the bp at position 11. There is as yet no
confirmatory experimental evidence for this proposal.

Results from three initially isolated mutants (mxoxox 1 to 3) indicated
about 50% of recombination frequency on loxK2. This is about ten fold higher
than that obtained with the E262G mutation alone. It is therefore likely that
some of the additional point mutations identified in these three mutants account
for this increase in activity. Table 8 lists all point mutations that were found. If
silent and conservative mutations are considered not to influence specificity,
only a limited number of candidates to account for the phenotype remains.
Among these, S254G and Q255R of mxoxox 2 and 3, because of their location
close to the amino-terminus of the J helix, could be expected to influence DNA
contacts with positions 11 or 12 of the lox site. The other mutations are
scattered in the N- and C-terminal domains of Cre. All, except R101Q of
mxoxox 5, affect aa that are not located within the proximity of DNA contacting
areas. Some appear independently in two mutants, e.g. D29A or D189N, and
could influence protein folding or the interactions among the four Cre enzymes
necessary for recombination. Such alterations could influence, for example, the
orientation of the J helix and thereby reduce remaining interference between the
loxK2 site and the enzyme. Alternatively, some mutations, also silent ones,
could influence protein expression, leading to a faster accumulation of enzymes
and consequently to higher recombination frequency. This possibility should
however also influence loxP recombination. In fact, after 2.5 h of cre
expression, the mutants mxoxox 1 and 2 showed a slightly higher frequency of
loxP recombination compared to wt cre (98% vs. 94%). This difference, on the
other hand, may be attributed to the variations that occur normally within
experiments. To address this question further, shorter cre expression times on
loxP would be required.

When the coordinates of the crystal structure (Guo et al., Nature,
389:40-46 (1997)) are available (Protein Data Bank, Brookhaven National
Laboratory), it will be possible to confirm many of the tenets discussed.

Finally, the increase in loxK2 tolerance between the E262G mutation
alone and the isolated mutants carrying additional point mutations justifies the
use of the DNA shuffling procedure linked to selection: not only has it
permitted the elimination of deleterious mutations from the sequence pool, but it
helped to accumulate various more or less beneficial aa alterations as well.

The E262A and E262W Mutations

The mentioned site-directed mutagenesis procedure was used to generate
two more defined Cre mutants, E262A and E262W. Compared to the E262G
mutation, E262A permitted loxK2 recombination ten fold less effectively. The
E262W mutation however resulted in similar activity on loxK2.

The aliphatic side chain (a methyl group) of A262 could be the reason
for slight sterical interference. This would explain the observed reduced
frequency of loxK2 activity with E262A. loxP recognition, however, could not
be found to be affected compared to E262G. The lowered loxK2 activity
explains why no E262A mutation was identified in the small pool of six
analyzed mutants: with a ten fold decrease in activity, one would expect to
encounter the corresponding mutation ten times less often during selections as
well.

In contrast to the small side chain of alanine, the large aromatic side
chain of W262 was expected to inhibit recombination between any lox sites due
to sterical interference. Surprisingly, this seems not to be true. A possible
explanation for the observed activity on loxP and loxK2 could be that the
aromatic and planar structure of the tryptophan side chain fits better into the J
helix - lox interface than does a methyl group. Different influences of A262
and W262 on folding of the J helix could also contribute to the observed
phenotypes. The reason why the E262W mutation was not identified among the
six analyzed mutants is the genetic code. Whereas for the E262G and E262A
mutations only one bp in the glutamate encoding GAA codon needs to be
mutated to GGA or GCA, for E262W the whole codon must be exchanged to
TGG. This is unlikely to occur during the applied random mutagenesis
procedure with a mutation frequency of 0.5%. Other amino acid changes due to
two or three mutations of the glutamate encoding codon therefore cannot be
considered to have occurred during the random mutagenesis procedure.

When the coordinates of the crystal structure are public, it will be
interesting to confirm and further investigate the discussed hypothesis.

PCR     Polymerase Chain Reaction
SD      Shine-Dalgarno sequence
ss      single stranded
tk      thymidine kinase
wt      wild type
X-gal   5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside

Example 2: Analysis of Variant Cre Recombinases 

This example describes analysis of the activity of several specific variant
Cre recombinases.

Vectors 

pBS606, 614, 626, 627, 628 and 650: pBAD33 with wt, E262G,
E262G/D29A, E262G/D189N, E262G/T316S, and R3M3 cre insertion used for
expression of the corresponding Cre proteins in DH5α for in vivo testing.

pBS632 to pBS641: pUC19 based plasmids for in vivo tests of different
Cre mutants on a variety of different lox sites and combinations of lox sites, all
bearing the FAS1 spacer (Sauer B., Nucleic Acids Res., 24:4608-4613 (1996)).
Recombination between two lox sites leads to excision of a neo cassette to give
kanamycin sensitive E. coli. The same plasmids were also used for in vitro
recombination experiments.

pRH200: wt Cre expression plasmid (a generous gift from Ron Hoess,
DuPont, Wilmington, DE) used to overexpress wt Cre in BL21(DE3) (Novagen,
Madison, WI) strain.

pBS654 to pBS658: wt cre of pRH200 was replaced with different cre
mutants (E262G, E262G/D29A, E262G/D189N, E262G/T316S, and R3M3)
using Age I and Mlu I restriction sites.

E. coli Strains

BS583: The E. coli strain BS583, DH5 Δlac (λD69 loxP^2[lacZ LEU2]),
was chosen as the bacterial background for the selection procedure using
plasmid pBS584. Due to the loxP^2[lacZ] containing λ prophage, Cre activity on
loxP can be evaluated simply using X-gal plates.

BS1494: The E. coli strain for selection was established by introducing
the selection plasmid pBS584 into BS583. Thus, BS1494 allows a
kanamycin-selection for loxK2 and in parallel a blue/white-screen for loxP recombination
with 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal). Note that the
spacer region of the loxK2 site (FAS1) is different from the original one of loxP.
Thus, recombination events between loxP of the λ prophage and loxK2 of the
selection plasmid pBS584 catalyzed by potent Cre mutants are excluded in
BS1494.

BS1576 to BS1581: For the in vivo recombination experiments, wt and
mutant Cre expressing strains were generated by introducing plasmids pBS606,
614, 626, 627, 628, and 650 into DH5α.

Transformation of E. coli

For plasmid transformations of E. coli strains, electroporation was
preferred over chemical protocols. Electrocompetent cells were made and used
for electroporation as described by Smith et al., Focus, 12:38-40 (1990). The
appropriate cell porator and cuvettes were from Life Technologies (Bethesda,
MD).

Site-Directed Mutagenesis

The QuickChange™ Site-Directed Mutagenesis Kit (Stratagene Cloning
Systems, La Jolla, CA) (1997) was used to generate defined single and double
mutations in the cre gene.

Overexpression and Purification of Candidate Mutants

Wild-type and five different mutant Cre proteins were overexpressed
using plasmids pRH200 and pBS654 to pBS658 in conjunction with Novagen
BL21(DE3) cells. After induction for 2.5 h, cells were harvested and sonicated, and
Cre was partially purified after DNase I digest with a single step Whatman® P11
resin (Whatman Inc., Fairfield, NJ) as described before by Wierzbicki et al., J.
Mol. Biol., 195:785-794 (1987). The obtained Cre preps were about 80% pure
and protein concentrations ranged between 100 and 200 ng/µl.

Mutant Analysis

In Vivo: Plasmids pBS632 to pBS641 were transformed into Cre-expressing
E. coli strains BS1576 (wt), BS1577 (E262G), BS1578
(E262G/D29A), BS1579 (E262G/D189N), BS1580 (E262G/T316S), and
BS1581 (R3M3). After 1 hour of induction of cre expression with 0.2%
L-(+)-arabinose, 10** dilutions were plated on non-selection medium containing 0.2%
D-glucose. After overnight incubation at 37°C, colonies were transferred to
kanamycin-selection plates for the described negative selection for neo excision.

In Vitro: Purified Cre mutants were used for both in vitro recombination
and gel retardation experiments as described before by Sauer B., Nucleic Acids
Res., 24:4608-4613 (1996) and Wierzbicki et al., J. Mol. Biol., 195:785-794
(1987), respectively. For the recombination reactions, plasmids pBS632 to
pBS641 served as substrates, whereas for the DNA binding reactions
γ-[32P]-dATP (Amersham Pharmacia Biotech, Piscataway, NJ) end-labeled 35 bp
oligonucleotides were used, each encoding a lox half-site and one half of the
FAS1 spacer.

Evaluation of Six K2+/P+ cre Mutants

Six of the 36 identified single Kan^R colonies of the third round were
chosen for further analysis and referred to as R3M1 to 6 (Round 3 Mutants 1 to
6). Retesting them in the indicator strain revealed that all but one (R3M4) show
significant loxK2 recombination and all are unbiased in their activity on loxP
(Table 14). Sequencing analysis revealed one amino acid change common to all
five mutants having increased loxK2 activity: a glutamate to glycine exchange
at position 262 (E262G) in the J helix of the Cre protein (Figure 13). A second
point mutation, a conservative threonine to serine exchange at position 316
(T316S), was identified in three of the mutants with enhanced loxK2 activity.
Two non-conservative mutations (D29A and D189N) were found in two of the
five mutants. In addition, ten mutations occurred only once. Therefore, the
critical mutation for loxK2 activity appears to be E262G.

Site-Directed Mutagenesis 

To address the question of the influence of the different point mutations
further, the QuickChange™ Site-Directed Mutagenesis Kit (Stratagene) was
applied to generate the following Cre mutants, each confirmed by sequencing:
E262G, E262G/D29A, E262G/D189N, and E262G/T316S.

In Vivo Characterization

To elucidate the contribution of specific amino acid changes in
conferring altered recombinational specificity to Cre, recombination assays with
different lox sites were carried out with the following Cre enzymes: the wt
enzyme, one of the originally sequenced third round mutants (R3M3), and the
generated single and double mutants. All lox sites used for the in vivo tests
were designed to have the same 8 bp spacer region (FAS1) so that
recombinational specificity was completely dependent on Cre's recognition of
the symmetrical inverted repeats of the lox sites. Note that wt Cre-dependent
recombination between loxP sites bearing the FAS1 spacer does not differ from
recombination between original loxP sites (Sauer B., Nucleic Acids Res.,
24:4608-4613 (1996)).

Figure 14 presents the recombination frequencies of various mutant lox^2
substrates and combinations of sites (loxP with loxK2 and, as control, with loxK1)
from the marker excision assay. Mutant lox sites with symmetric nucleotide
substitutions at positions 11 and 12 of the loxP sequence were tested with the wt
enzyme and the five variant Cre mutants, including the multiple mutant
R3M3 (A). All enzymes showed a maximum in recombination (close to 100%)
with thymines at these positions, i.e. with loxP^2. Adenines, i.e. loxK2^2, lowered
the recombination frequencies drastically for the wt enzyme, whereas the single
and double mutants performed approximately 50% to 70% less effectively.
R3M3, however, showed nearly loxP-like activity on the loxK2^2 substrate, as
seen before with the selection strain (Table 14). Altering the two thymines to
guanines resulted in similar recombination frequencies as seen with adenines at
these positions. Cytosines, on the other hand, did not result in similar
recombination frequencies as seen with thymines, but were surprisingly the
least efficiently recognized substitutions of positions 11 and 12. To conclude, it
appears that the E262G mutation is necessary and sufficient to significantly
increase recombination frequencies on lox sequences which are symmetrically
altered at positions 11 and 12. Of the additional mutations tested, D29A seems
to be slightly beneficial, whereas D189N and T316S appear indifferent or even
slightly deleterious for recombination on the variant sites. Thus, additional
mutations of R3M3 (Figure 13) must be responsible for its further increased
performance on the mutant substrates in vivo.

Figure 14B shows the observed recombination frequencies on mixed
substrates (e.g. loxP with loxK2). For both loxK2 and the control substrate,
loxK1, recombination with loxP by wt enzyme was substantially less than for
loxP^2 recombination. This recombination frequency was increased dramatically
with all of the mutant Cre proteins. These results hint that there is not only
cooperativity in binding of two Cre molecules to one lox site but
also cooperativity between Cre molecules binding to different sites
which then are synapsed and recombined. This finding is especially useful for
genomic targeting: it suggests that a targeting vector carrying a loxP site will
be effectively recombined with the endogenous lox-like site by the Cre mutants
as long as the spacers are compatible.

Noteworthy also is R3M3-Cre's increased ability to recombine loxK1 by
itself, whereas all the other mutants, like wt, did not. Again, some additional
mutations found in R3M3 (Figure 13) seem to be responsible not only for
increased recombination frequencies on loxK2 but also on loxK1 compared to
Cre bearing the E262G mutation alone or in conjunction with the D29A
mutation.

In Vitro Characterization

In Vitro Recombination Assays. For the in vitro recombination
experiments the same substrates as in vivo, that is, pBS632 to pBS641, were
used. An in vitro recombination experiment using wt, E262G, and R3M3 Cre
preps on loxP^2, loxK2^2, and loxK1^2 substrates showed that, as seen in vivo before,
wt Cre is capable of recombining loxP^2 substrates only. No recombination
products are visible when loxK2^2 or loxK1^2 substrates are used. By
introducing the E262G mutation into Cre, however, recombination of loxK2^2
substrates becomes possible at elevated frequencies, and even for the loxK1^2
substrate recombination products become weakly visible in vitro. E262G-Cre
activity on the loxK1 control site in vitro, but not in vivo, probably derives from
differences in ionic strength and/or enzyme concentration between the assays.
Finally, with R3M3 Cre the ability for both loxK2^2 and loxK1^2 recombination is
further increased, as expected from the in vivo assays. As mentioned in the in
vivo results before, guanines at positions 11 and 12 of the lox sequence were
recognized at similar frequencies as seen with adenines (i.e. loxK2), whereas
cytosines were clearly less tolerated by all Cre mutants. Slight differences
between in vivo and in vitro recombination frequencies are probably due to
differences in ionic strength, temperature, DNA condensation, enzyme
concentration, etc. In general, the pattern of the in vitro recombination
frequencies of the different Cre enzymes on the different lox sites mirrors the
one seen in vivo.

Gelshift Experiments. Gelshift experiments were applied to address the
question of in vitro DNA affinities of the different Cre mutants. As expected
from the previous results, all three Cre enzymes bind with similar efficiency to
loxP, whereas to loxK2 only E262G and R3M3 show binding affinity.
Surprisingly, R3M3 binding appears less efficient on the loxK2 half-site than
binding of E262G, and on the loxK1 half-site only E262G binds weakly
whereas R3M3 does not.
DISCUSSION 
lox sites 

loxP and loxK2 differ at several locations as illustrated in Figure 1:
First, the three outermost bp of the inverted repeats are altered (positions 15, 16,
17, 15', 16' and 17'). Second, the entire non-canonical 8 bp spacer is
completely exchanged (positions 4 to 4'), and third, two transversion events
(thymine to adenine) are introduced at positions 11 and 12 (and mirrored at 11'
and 12') of the lox site, mimicking potential recombination targets in
eukaryotes. Only the two mutations at positions 11 and 12 are considered
important for inhibiting wt Cre from recognizing the site. Thus, they were the
only alterations in the lox sites used for the in vivo and in vitro characterization
experiments. Other investigations have shown that the two alterations of loxP
mentioned first are without inhibiting effect on wt Cre (Sauer, B., Mol. Cell.
Biol., 7:2087-2096 (1987) and Sauer, B., Nucleic Acids Res., 24:4608-4613
(1996)). Noteworthy, however, is that the altered 8 bp spacer region (FAS1
spacer) does not allow loxP - loxK2 recombination, since the regions where the
single-strand cleavages and exchanges take place are not compatible. To allow
simultaneous monitoring of Cre-mediated recombination both at the wt loxP²
and at a mutant lox² substrate (loxK2²), incompatible spacer elements were used
to prevent recombination between the two types of lox sites by a candidate Cre
mutant with altered specificity. Such recombination might easily compromise
ready detection of desired Cre specificity mutants. Incompatible spacers
(original loxP and FAS1) formed the basis for the simultaneous selection for
loxK2 recombination and screen for loxP recombination with E. coli strain
BS1494, which led to the disclosed variant Cre recombinases.

loxK1, the other lox sequence used in this study, bears two critical bp
exchanges per arm as well, however at different positions: 10 and 14. It was
used as a control lox site, addressing the question whether the generated Cre
mutants with novel specificity for loxK2 can also tolerate adjacent alterations
within the lox sequence.
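
The position numbering used for the lox sites can be illustrated with a short sketch. It is an illustration only and not part of the disclosed method. It assumes the canonical 34 bp loxP sequence, treats positions 5 to 17 as the arm positions counted outward from the spacer (consistent with the spacer occupying positions 4 to 4' and the outermost bp being 15 to 17), and applies each substitution symmetrically to both inverted repeats. The names `symmetric_variant` and `reverse_complement` are hypothetical. The example alters positions 11 and 12 only; it does not reproduce the full loxK2 sequence, whose spacer and outermost bp also differ.

```python
# Canonical loxP: 13 bp left arm, 8 bp spacer, 13 bp right arm (inverted repeat).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def symmetric_variant(site, substitutions):
    """Substitute bases in the left arm and mirror them in the right arm.

    `substitutions` maps an arm position (17 = outermost bp, 5 = bp next to
    the spacer) to the new base on the top strand of the left arm. The right
    arm is rebuilt as the reverse complement of the altered left arm, which
    assumes the two arms are perfect inverted repeats (true for loxP).
    """
    left_arm, spacer = list(site[:13]), site[13:21]
    for position, base in substitutions.items():
        if not 5 <= position <= 17:
            raise ValueError("arm positions run from 5 to 17")
        if base not in "ACGT":
            raise ValueError("base must be one of A, C, G, T")
        left_arm[17 - position] = base  # index 0 is position 17
    new_left = "".join(left_arm)
    return new_left + spacer + reverse_complement(new_left)


# Thymine-to-adenine transversions at positions 11 and 12 (and 11', 12').
variant = symmetric_variant(LOXP, {11: "A", 12: "A"})
print(variant)
```

With this numbering, positions 11 and 12 of loxP are both thymines, which agrees with the thymine-to-adenine transversions described for loxK2.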

Cre Mutants after Three Rounds of Directed Evolution

Three iterations of the in vitro evolution procedure were used to identify
36 candidates which express Cre mutants that could process loxK2 (based on the
selection), as well as loxP (based on the simultaneous screen). wt Cre, on the
other hand, only shows marginal activity on loxK2 when expressed at very high
levels.

Characterization

Sequence analysis revealed that five mutants with significant loxK2
activity had in common only one point mutation, leading to the amino acid
change E262G. However, several other point mutations occurred independently
twice or thrice, among them D29A, D189N, and T316S. To investigate the
influence of the mentioned mutations on the observed phenotype, specific single
and double mutants were generated by site-directed mutagenesis. In vivo and in
vitro assays were then carried out with wt-Cre and five different mutants
(R3M3, E262G, E262G/D29A, E262G/D189N, E262G/T316S) to compare
their performance on a variety of different alterations of the loxK2 site.

The in vivo and in vitro recombination assays showed a similar pattern
in recombination frequencies for the different enzymes on the different sites
tested. In general, recombination frequencies on mutant substrates were
distinctively the highest with R3M3. The double mutant E262G/D29A was
about half as effective as R3M3, whereas the other double mutants and the
single mutant E262G showed slightly further decreased recombination
frequencies on the altered sites. The wt enzyme did not recombine any of the
mutant substrates presented here, neither in vivo nor in vitro. With previous
results showing that single D29A, D189N, and T316S mutants of Cre perform
like the wt enzyme on loxK2 and loxK1 in vivo and the fact that the E262G
point mutation was the only one found independently in all the originally
isolated Cre mutants with loxK2 specificity, it is clear that E262G is a critical
mutation that allows Cre to recognize loxK2. However, in combination with
E262G, D29A permits still higher recombination frequencies on lox sites altered
at positions 11 and 12. Since R3M3 shows even further increased
recombination frequencies on the altered sites compared to E262G/D29A, others
of the point mutations identified in this Cre mutant (see Figure 14) must account
for this increase in activity. Because of its location close to the amino-terminus
of the J helix, the Q255R mutation of R3M3 could be expected to influence
DNA contacts. Other mutations may influence protein folding or protein-
protein interactions, which could result in a higher flexibility within the Cre-lox
interface and thus allow a better tolerance of alterations of the lox sequence.
This hypothesis is also supported by the observation that R3M3 recognizes the
loxK1 site at frequencies similar to E262G recognizing loxK2. The double and
single Cre mutants, on the other hand, did not show activity on loxK1. In
addition, the gel-shift experiments showed that R3M3-Cre's binding affinity for
loxK2 and loxK1 half-sites is less than that of E262G-Cre and of the three double
mutants. Taking these results together, other mutations of R3M3 must further
influence Cre-lox interactions to allow enhanced recombination on loxK1. On
the one hand, this results in less efficient binding to lox half-sites; on the other
hand, when complete lox sites are available, the cooperativity phenomenon
between Cre enzymes binding to the same and different lox sites may
compensate for this loss in binding activity. Then the postulated increased
flexibility between DNA and protein seems to become advantageous for
recombining altered lox sites, as seen with the in vitro recombination assays.

Alternatively, some mutations, including silent ones, could influence protein
expression, leading to a faster accumulation of enzymes and consequently to
higher recombination frequency. An E. coli codon usage table suggested,
however, that none of the identified mutations should improve Cre expression
in E. coli remarkably.
Modeling 

With the published crystal structure of four Cre molecules bound to two
synapsed loxA sites after the first single strand cleavage (Guo et al., Nature,
389:40-46 (1997)) the identified point mutations were analyzed for being
involved in DNA and/or protein interactions. All of them, including the E262
position in the J-helix of Cre, were found to be not involved in either interaction
in this state of the reaction. This observation indicates that the mutations which
were found to account for the described novel substrate recognition in vivo and
in vitro lead to this new phenotype in a less direct and obvious manner. As
mentioned earlier, they may influence protein folding, resulting in a higher
flexibility within the Cre-lox interface. This hypothesis is especially well
supported by the described differences between binding to lox half-sites and
recombination. Alternatively, they may still be involved in protein-protein
(D29A) or protein-DNA (E262G) contacts before or after the formation of the
clamp-like strand-exchange state.
In Vivo and In Vitro Recombination

Some additional variations of the loxK2 site were tested also. The list in
Table 17 shows all the sequences of lox sites which were tested and assigns
them a name as well as their plasmid (pBS) number. In Figure 15 the obtained
in vivo recombination frequencies on all the variants of loxK2 are indicated.
The additional results indicate that alterations at position 12 of the lox half-site
are of more importance for Cre-based recombination than ones at position 11.

The in vitro recombination frequencies of all six Cre enzymes tested on
the lox sites listed above are given in Table 15. The frequencies were calculated
after quantitation of the brightness of fluorescence of the ethidium bromide-
stained DNA fragments on agarose gels. Differences in temperature, ionic
strength, medium composition, and enzyme concentration probably account for
the observed differences between in vivo and in vitro recombination results.
Most strikingly, loxP is no longer recognized with the highest frequencies in
vitro. However, when the ionic strength in the in vitro assay was increased,
results began to resemble the ones seen in vivo. Thus, the efficiency of
recombination with variant lox sites by each of the Cre mutants and the wt
enzyme can be further controlled in vitro by adjusting ionic strength and other
in vitro conditions.

Qualitatively, however, the in vivo and in vitro recombination
frequencies mirror each other. These novel Cre mutants thus possess a
specificity for substrates (loxK2 and its derivatives) which are not recognized by
the wt enzyme.
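
The text states that in vitro frequencies were derived from the fluorescence of stained bands but does not give the formula. The sketch below shows one plausible calculation under stated assumptions: the frequency is taken as the share of the total lane signal found in the product bands, without any correction for fragment length. The function name and the intensity values are hypothetical and do not come from the data reported here.

```python
def recombination_frequency(substrate_intensity, product_intensities):
    """Percent of the total lane signal that lies in product bands.

    Assumption: band fluorescence is used directly, with no normalization
    for fragment length or background subtraction.
    """
    product_total = sum(product_intensities)
    total = substrate_intensity + product_total
    if total <= 0:
        raise ValueError("lane has no measurable signal")
    return 100.0 * product_total / total


# Hypothetical lane: one unrecombined substrate band and two product bands.
frequency = recombination_frequency(750.0, [150.0, 100.0])
print(f"{frequency:.1f}% recombination")
```

Because ethidium bromide signal grows with the mass of DNA in a band, a real analysis might weight each band by fragment length; that refinement is omitted here.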

Gelshift Assays 

DNA binding (gelshift) experiments were also done with the generated
Cre double mutants (E262G with D29A, D189N, or T316S) to analyze
binding of the variant Cre recombinases to various recombination sites. In table
16 the observed mean percentages of binding to loxP, loxK2, and loxK1
half-sites with the five different Cre mutants and the wt enzyme are given. As
shown, all mutants, in contrast to the wt enzyme, do bind to loxK2 with
similar frequencies, except R3M3, which shows surprisingly low retardation. As
discussed above, this phenomenon may be explained by an increased
tolerance, i.e., flexibility, of R3M3 for altered lox sites. On half-sites, which
preclude the cooperativity between Cre molecules in binding, R3M3 cannot
bind as tightly as wt or the single and double mutants. With the loxK1 half-site
this 'binding versus recombination' difference is even more striking.
Whereas wt and the R3M3 mutant cannot bind to the half-site, all the other
mutants can. Yet, recombination of loxK1 substrates was seen with R3M3 Cre
only. These results show clearly that simple DNA affinity does not correlate in
a one-to-one fashion with recombination. Thus, inappropriate DNA binding by
recombinases likely can lead to a block in recombination.
Conclusions. The in vivo data show that:
1. The E262G mutation confers a generally elevated level of recombination
at a number of variant lox sites, those having any of a large number of
alterations at positions 11 and 12 (and mirroring 11' and 12' alterations).
2. The D189N mutation in conjunction with the E262G mutation appears
to fine tune the broadened specificity of the E262G mutation by reducing
recombination at the loxK2 variants 'GG', 'TC' and 'CC' without decreasing
recombination at loxK2 and the 'GT' variant. This mutation is thus useful to
limit the broadened specificity of E262G.
3. The T316S mutation when in conjunction with E262G provides a slight
boost in recombination frequency with the loxK2 variants 'GT' and 'CC', and
has no deleterious effect on recombination at the other variant loxK2 sites.
4. The D29A mutation together with E262G boosts recombination at loxK2
and the variants 'CC' and 'GG'. D29A does not reduce recombination at the
other variant lox sites or at loxP.
5. Some of the additional mutations of R3M3 must account for the further
increased recombination frequency on any of the tested mutant lox sites,
including loxK1, but do not compromise loxP recognition.

The disclosed variant recombinases have a number of useful features
and applications. By recognizing an altered, user-defined target site, they were
designed to allow both genetic targeting events in prokaryotes and eukaryotes,
like wt Cre but on different sites, and in vitro recombination strategies. With a
wider variety of possible target sequences now being accessible, multiple and
defined genomic alterations will become feasible. This opens more
possibilities in designing genomic manipulations in all DNA-based organisms
by site-specific recombination.

With the various genome projects under way today, there will be more
and more applications for site-specific recombination to study the impact of
genes or genetic control elements by genomic engineering. In addition, genome
manipulations are also more frequently used to express recombinant proteins
within the organism of choice.






Table 13. Presented are the results of the first four rounds of the
described mutagenesis procedure for loxK2 specificity. Column one indicates
the round of the mutagenesis procedure (round 0 indicates that no mutagenesis
event has taken place yet, wt enzyme only) and column two the allotted time
for cre expression in induction medium. Column three presents the observed
frequencies of KanR, indicating loxK2 recombination by a mutant candidate. Of
note, all KanR colonies found during this experiment were white, indicating
effective loxP recombination. In column four the actual number of KanR
colonies found in each round of 1000 to 10^ plated is given, and the last column
presents the frequency of white colonies on non-selection plates, which is
decreasing with every round.

Table 14. Presented are the frequencies of KanR of six chosen candidate
mutants from round three of the mutagenesis procedure (R3M1 to 6) when
retesting them in the indicator strain. Also, they were tested for their
performance on loxP sites in identical fashion. The obtained KanR frequencies
indicate the percentages of recombination within the allotted induction time of
2.5 hours.
Table 2: List of Strains Used for Selection and Screen

BS#    Description        Substrates      Selection and Screen
1493   BS583 [pBS583]     loxK1²/loxP²    neo selection and lacZ screen, respectively
1494   BS583 [pBS584]     loxK2²/loxP²    neo selection and lacZ screen, respectively
1541   BS583 [pBS613]     loxP²/loxP²     neo selection and lacZ screen, respectively
1523   NS2300 [pBS601]                    abf A.si screen and neo selection, respectively
1524   NS2300 [pBS602]    loxK1²/loxP²    abf A.si screen and neo selection, respectively

List of the E. coli strains used for the described selection and screening procedures for the desired Cre mutants.
BS1493 and BS1494 served as selection strains for K1* or K2* Cre mutants, respectively. BS1541 was used as a
control strain to determine loxP activity. BS1523 and BS1524 were used to select for P* mutants. Construction
and application of the different selection procedures are explained in detail in the text.
Table 3: Sequencing Primers for cre



Primer BSB#


Sequence 5' - 3'


Position (5', 3') in cre


454 


TTT GGG CTA GCG AAT TCG AG 


-55. -36 


455 


TTT GGG CCA GOT AAA CAT GC 


273. 292 


456


CGG TGG GAG AAT GTT AAT CC 


567, 586 


457 


GO A CAC AGT GCG CGT GTC • 


862, 879 


458 


TCr GCG TTC TGA TTT AAT CTG 


1117. 1097 


459 


CCA GGG C AG GTA TCT CTG 


858, 841 


460 


GTA CGT GAG ATA TCT TTA ACC C 


563.542 


461 


TTG CTG GAT AGT TTT TAG TGC C 


270, 249 



Presented are the eight sequencing primers used for sequencing of the entire cre gene in both directions and their
positions within the cre coding sequence. Primers BSB454 to BSB457 allow sequencing of the coding strand,
whereas the four remaining primers have been designed for the non-coding strand. In order to achieve complete
sequence information in both directions, about 350 bases have to be read out of each sequencing reaction. The
positions given for each primer refer to its 5' and 3' end, with position 1 referring to the adenine of the ATG start
codon of cre.
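
The read length of about 350 bases quoted in the caption can be checked against the primer spacing with simple arithmetic. The sketch below is an illustration under stated assumptions: it uses the 5' end positions listed in Table 3, treats the coordinates as a continuous integer scale (ignoring that there is no position 0 between -1 and 1), and only tests whether neighbouring reads on the same strand would meet. The function names are hypothetical.

```python
READ_LENGTH = 350  # approximate bases read per reaction, per the caption

# 5' end positions of the primers in the cre coding sequence (Table 3).
CODING_STRAND_STARTS = [-55, 273, 567, 862]      # BSB454 to BSB457
NONCODING_STRAND_STARTS = [1117, 858, 563, 270]  # BSB458 to BSB461


def max_gap(starts):
    """Largest distance between the 5' ends of neighbouring primers."""
    ordered = sorted(starts)
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def reads_overlap(starts, read_length=READ_LENGTH):
    """True if each read reaches at least the start of the next primer."""
    return max_gap(starts) <= read_length


for name, starts in [("coding", CODING_STRAND_STARTS),
                     ("non-coding", NONCODING_STRAND_STARTS)]:
    print(name, max_gap(starts), reads_overlap(starts))
```

On both strands the largest spacing is below 350 bases, which is consistent with the statement that reads of about that length give complete sequence information.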



Table 4: Primers Used in the QuickChange™ Site-Directed Mutagenesis Kit (Stratagene)



mm 








GCTATCAACTXXK:GCCCTCk}CAGGGArnTrGAA 


765, 809 


466 


GAGTTOCTTCAAAAATCCCTCXXAGGGCGCXSAGTTGATAGCTGGC 


805, 761 


46:7- 


GCTATCAACTCGOGCCCTGOCAGCGATnTrcAAGCAACrCATO 


765, 809 




GAGTnXnTCAAAAATCCCTGCCAGGGCGCGAGTTOATAGC^ 


805,761 




GCTATCAACrCGCGCCNNNNNNNNNATTTTTGAAGCAACTCATCG 


765, 809 


470 


GACTTGCTrCAAAAATNNNNNNNNNGGCGCGAGTTGATAGCTGGC 


805. 761 



Illustrated are the mutant oligonucleotides used in the QuickChange™ Site-Directed Mutagenesis Kit (Stratagene)
to introduce single point mutations (mismatches with the wt sequence are highlighted by bold letters). With
BSB465/466 an E262G mutation, and with BSB467/468 an E262A mutation is introduced in Cre. The last set of
primers (BSB469/470), as indicated by the bold N, represents an equimolar mixture of all possible bases at the
assigned positions. This mixture results in all possible aa combinations at positions 261 to 263 of Cre. As
mentioned before, the indicated positions refer to the 5' and 3' ends of the oligonucleotides in the cre coding sequence.
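
The claim that the degenerate primers randomize residues 261 to 263 can be verified from the primer coordinates. The following sketch is illustrative only. It assumes the forward degenerate primer starts at position 765 as listed in Table 4, and it reads one scan-ambiguous character in the flanking sequence as T; the result depends only on where the block of N bases sits. The function names are hypothetical.

```python
PRIMER_START = 765  # 5' end of the forward primer in the cre coding sequence

# Forward degenerate primer (BSB469): 16 fixed bases, 9 randomized, 20 fixed.
PRIMER = "GCTATCAACTCGCGCC" + "N" * 9 + "ATTTTTGAAGCAACTCATCG"


def codon_number(nt_position):
    """Codon (residue) number for a 1-based coding-sequence position."""
    return (nt_position - 1) // 3 + 1


def randomized_codons(primer, start):
    """Codons touched by at least one N in the primer."""
    positions = [start + i for i, base in enumerate(primer) if base == "N"]
    return sorted({codon_number(p) for p in positions})


primer_end = PRIMER_START + len(PRIMER) - 1
nucleotide_variants = 4 ** PRIMER.count("N")

print(primer_end)                               # 3' end position of the primer
print(randomized_codons(PRIMER, PRIMER_START))  # residues covered by the N block
print(nucleotide_variants)                      # distinct sequences in the mixture
```

Nine fully randomized bases give 4^9 = 262144 nucleotide sequences, which is the same as 64^3 codon triplets, so every combination of three amino acids (20^3 = 8000, not counting stop codons) is represented in principle.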
Table 5: Comparison of Mutagenic and Non-Mutagenic PCR on wt cre



genie ^ RjGK 
White - ' 



> 99.6% 



15:7%" 



< 0.4% 



81.3% 



> 99.5% 



60.8% 



< 0.5% 



39.2% 



(SOQ.«imluc,ion of c«exp«ssion(SOB+ara).U« second column compares ^'^J^^ff * ™^ 
«Ki white colooies after a oo«-mul.genic PGR, *c *ird column after a ^'^f "l^'^^^* 
to the to^ flanked lacZ gene of BS583 is no. excised by Oe. wh,.e ones .u ^^■'"^^f' "^^"^;J^, ^ 
*e number of blue colonics resuKing from empty pBAD33 vector '^^"""'^^J^l^ T^ZZ^ 
subtracted before calculating the presented values. TTiis blue badtgwxmd v»as deterauned wuh a control ligation 

increasing three limes under mutagenic condiuons. 
Table 6: The First Four Rounds on loxK2²



Round 



Status 



laduction 



mxoxoxox 
mxoxoxinx 



4 h 



.4 h 
2.5 h 



4 h 



2.5 h 



4 h 



2.5 h 



4 h 



2.5 h 



4 h 



% White 
(Non'Selccfion) 



83.9% 



42.4* 



88.3% 



82,5% 



13% _ 
24% 



80.7% 



85% 



25.7% 



21.8% 



75% 



2 5 h 1 <7^ 



% Kan* 
(Wliite) _ 



0.01% 



O.J 6% 



0.02% 



0.56% 



<o.gi% 



0.3% 



0.2% 



25% 



0.02% 



5% 



4.6% 



0.3% 



Nb. Kan" 
(White) 



47 



36 
10^ 



250 



»02 



Nb. Pooled for 
lUc Next Round 



47 



30 



mxox oxinx ^-J " | ^ «.» i ■ ■ ,„.-;r.^;,„ th^ 

following symbols for Ihe suitus '^.f?„f^'^^;'Xs" I d!ges, and reassembly. Column one indicates 
muugenlc PCR. and "x" siands for .he shuffling «ep by DNasc I ^ ^^^^^ ^ 

*e«H»<lof fte Biulagenesis procedure. colum» 7° ^^laied p^^^^ 
penntaed forcr* expression in induction medrum. In column four <he«^^^^ £ 

S^ineleaio. plates is indicated, decreas-ng w.th '^'^ .«J,W.«.o. 
„„^ge«ielCR-s..i«.Co,umnfivep^^U;c^^^^^^^^^ 

by » mutanl candidate. Of note, that all Kan colonies louno outi B »" - -Qio-ies found ia each 

Effective (oxP recombination. The two last columns g.ve -he « "^""""^jf ^ '"^ 
round (W5 dilution) aiid mark the ones used as surling maienrf for the v^l round. 



Cre 
Candidate 


Induction 
Time 


BS1494 {/oxK2') 


% ^Kan*^ witu 

BS1541 {lax?') 1 
93.9% 


BS1493 aoxKi») 
0.001% 


wt 


2.5 h 
4h 


0.002% 
n.d. 


n.d.. 
98.2% 


0,002% 
0.004% 


mxoxox 1 


2.5 h 
4h 


66.4% 
n.d. 


n.d, 
98% 


0.005% 
0.007% 


^mxQxox 2 


2.5 h 
4h 1 


53.6% 
n.d. 


n.d. 

80.5% 


0.005% 
0.007% 




2.5 h 
4h 


66.5% 
n.d. 


n.d. 
80.3% 


0.01% 
0.008% 




2.5 h 
4h 


0.05% 
n.d. 


n.d. 
81.2% 


0.006% 
0.006% 




2.5 h 
4h 


15-2% 
n.d. 

1 3.1% 


n.d. 
83.7% 


0.009% 
0.003% 




2.5 h 
r 4h 


Q.d. 


n.d. 


0.01% 



^X^on. The .h«. followins columos show *e o.cula.ed ^"^,^^''^^1^ 
di««S.l Oe mu.ants «nd w, Cre .o U.C Arec selection sU3.ns. m ft«,uc»cy of *e ofcao«l ^P^^e « 
™,drf«rd K an indicator for the frequency of lox recombination: On hxKX. all muune. except raroxox i. 

similar resulis i wi Cre. indicaliag. ihat /^xP rccombinauoa .s at best slightly affected, '«K1 r«U^ 
luat nor wt Cr. show more than background activUy. even after four ^l^l^H^^^' ^ ^^"^ 
mutants are therefore chanicterized by a broader substrate cccognmon compared to the wt enzyme. 
Table 8: Sequence Analysis of the Six mxoxox K2*/P' Mutants






Crc 


Codon 


Cliangc in 


ere 


aa Chan£C i 


Nb. of 1 


Position in 


Candidate 




wt 


>■ null 




Isolates 1 


Cre 


tHKOXOX I 


45 


CCG 


CCI 


(PI5P) 1 


2 j 


(N'icrminal) 




86 


GAC 


ccc 


D29A 1 


2 1 


A 




565 


QAC 


AAC 




2 1 


loop 1 • 2 




642 


AGC 


AGI 


(S2I4S)* 1 


i 1 


(N lcmi. of 1) 




679 


ere 


ATC 


V227I 1 


1 1 


I 
J 




785 


CM 


GQA 


E262G* 1 


5 1 


mxoicox 2 


45 


CCG 


CCI 


(PI5P) 


2 I 


(N-icftninal) 


413 


CM 


CfiA 


EI38G 1 


1 1 


N-»erm. of F 




760 


AC3C 


fiGC 


S254G 


1 1 


loop 5 - J 




785 j 


CM 


GQA 


E262G* 1 


5 


J 




861 j 


Td 


TCC 


(S287S*) 


2 


N-icrm. of K 




946 


&pc 


ICC 


T3I6S 


3 1 


loop L - M 


mxoxox 3 


Aft 


QTC 


ATC 


VI6I 


1 


N-fcnninal 




565 


QAC 


AAC 


D189N { 


2 


loop 1 - 2 




592 


QGC 


AGC 


GI98S 


I 


C-tcrm. of 2 




668 


CGA 


CAA 


R223Q 


1 


i 




764 


CAG 


CQG 


Q255R 


1 


loopS^J 




785 


GAA 


GQA 


E262G* 


5 


J 




86 i 


TCT 


TCC 


(S287S*) 


2 


1 N-term. of K 




920 


CCG 


CIG 


P307L 


1 


L 


mxoxox 4 


230 


TAT 


TQT 


Y77C 


1 1 


1 ^ 


412 


QAA 


AAA 


E138K 


1 1 


1 N-term. of F 




851 


CIG 


CAG 


L284Q 


1 2 


1 loopJ-K 


mxoxox 5 


86 


GAG 


GCC 


D29A 


I 2 


A 




302 


CQG 


CAG 


RIGIQ* 


1 ^ 


1 ^ 




659 


CIG 


CAG 


L220Q 


1 i 


1 ^ 




785 


GM 


GGA 


E262G* 


1 ^ 


I J 




946 


ACC 


ICC 


T316S 


1 3 


1 loop L - M 


mxoxox 6 


785 


CM 


GQA 


E262G* 


3 


1 J 




851 


CTG 


CAG 


L284Q 


j 2 


1 loop J - K 




946 


ACC 


ICC 


T31CS 


1 3 


1 loop L - M 



The point mutations identified by sequence analysis in the six selected K2*/P' mutants. Column one
identifies the mutants, column two the observed point mutations and their position in the cre coding sequence.
Column three indicates the resulting aa changes and their positions in the Cre protein (silent mutations in
parentheses). DNA contacting residues, according to the crystal structure (Guo et al., 1997), are marked by an
asterisk. Column four indicates how often the different point mutations were independently found in the pool of
the six mutants, and column five shows where the aa changes are located in the secondary structure of the protein.
Of the altogether 31 point mutations listed, five are silent. Only one aa change, E262G, is common to all
mutants with remarkably increased loxK2 activity (mxoxox 1, 2, 3, 5, and 6), suggesting that this mutation is
the essential one for the observed phenotype. Nine point mutations occurred twice or three times in the pool,
whereas eleven were found in one mutant only. For further information, see text and appendices A, B, and C.
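
The nucleotide positions and residue numbers in Table 8 are related by simple integer arithmetic, with position 1 being the adenine of the ATG start codon. The sketch below illustrates that relationship for a few of the listed mutations. It is an illustration rather than part of the analysis; the helper names are hypothetical, and the position values are read from the table as scanned.

```python
# Nucleotide position of the changed base for selected mutations (Table 8).
MUTATIONS = {
    "D29A": 86,
    "D189N": 565,
    "V227I": 679,
    "Q255R": 764,
    "E262G": 785,
    "T316S": 946,
}


def locate(nt_position):
    """Return (codon number, base within codon) for a 1-based position."""
    codon = (nt_position - 1) // 3 + 1
    base_in_codon = (nt_position - 1) % 3 + 1
    return codon, base_in_codon


def residue_number(label):
    """Extract the residue number from a label such as 'E262G'."""
    return int("".join(ch for ch in label if ch.isdigit()))


for label, position in MUTATIONS.items():
    codon, base_in_codon = locate(position)
    consistent = codon == residue_number(label)
    print(label, position, codon, base_in_codon, consistent)
```

For E262G, position 785 is the second base of codon 262, in agreement with the GAA to GGA change listed in the table.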



Table 9: Functional Evaluation of Three Defined Cre Mutants for Position 262 and wt Cre



Cre Mutant



wt 



E262G 



E262A 



and E262W alone are sufficient 10 remaricaWy increase KAt s aciivuy •« 
increase loxKi recogniiion, and ihai tox? activity may be slightly affected. 



Induction 
Time 



2.5 h 



2.5 h 



TTtr 



2.5 h 



BS1494 (toxK2^) 



0.002% 



2.2% 



0.4% 



2.9% 



% Kan*" with 
BS1541 jlox?^) 



93.9% 



84.2% 



82.3% 



81.7% 



PS 1493 (toxKl') 



0.001% 



0.004% 
0.006% 



6.003% 
In Vitro Evolution of the Cre recombinase



wiCrt 

nixox.oxl 

mxoxox2 

rnxoxoxS 

inxoxox4 

mxoxoxS 

fnxoxox6 

-55 



TntXSOCTAG 
TrrcCGCTAG 
TrrOGOCTAG 
•nTCGCCTAG 
TrrCGGCTAG 
IliOCXXTAC 
TnOOGCTAG 



CGAATTCCAC 
CXjAATTOCAG 
CGAATTCCAC 
OGAATTOGAG 
CGAATTCCAC 
OGAATTOGAG 
CCAATTOGAC 



CrCOGTAOCC GGGGATQCTC 

CrOOCTAOCC GGGGATCCrc 

CrCOGTAOCC COOGATtCTC 

CTOOGTACCC GOOG ATOCTC 

CTCGGTACCC OOOGATOCTC 

CTCGGTACCC OGGCATOCTC 

CTCGCTAOGC GGGGATCCTC 



TAGACTGAGT 
TAGACrCAGT 
TACAGPGAGT 
TAGACICAOr 
TAGACIGAOT 
TACACTOAGT 
TAGACICAGT 



wtCie 

mxoxoxl 

mxoxox2 

mxoxox3 

nixoxox4 

mxoxoxS 

mx 0X0x6 

-5 



GTGAAr.TTTC 
GTGAA^-TTC 
CTGAA--,TrrC 
GTGAA.--7rC 
GTGAAA.t-yrC 
GTCAAATTrC 
GTGAAAT-yrc 



-wfCre 
mxoxoxl 

raxoxojcl' 



*.;iiixoxo>i4; 




gtcgatocaa 
gtcgatocaa 
gtcgatocaa 
atggatccaa 
ctcdgatocaa 
gtcgatcx:aa 
gtcgatocaa 



i ggatogocag 
] ggatogcxag 

l^^x^^ GGATOGOCAG 
m^MXOxM GGATQGCCAO 
.^VVx|3|: GGATOGOCAG 
-•ihxbxoxSi? GGATOGGCAG 
:mxoxox6: GGATOGOCAG 

96 



CAATTTACrG 

caatttactg 

CAATTTACrc 
CAATTTACTC 
CAATTTACIG 
CAATTTACTG 
CAATTTACTC 



OGACTTGATGA 
OGAOrGATCA 
0GAGTC5ATCA 
OGAGIGATGA 
OGAGIGATGA 
OGAGIGATGA 
OGAGTSATGA 



GOGl 1 1 lUG 
OCGTnTCrG 
GOGrmCTG 
GO O 'l lUUG 

GoarmoG 

GCXiUllUO 

GCGrnrcTG 



accgtacacc 
aoogtacaoc 
acogtacaoc 

ACCGTACAOC 
AOCCTACACC 
AOCGTACAGC 
ACCGTACACC 



CGTTOGCAAG 
CCnTOGCAAG 
OGTIOGCAAG 
GCrroOCAAG 

GGmccx:^ 

GUI IUjCAAG 
OGITOaCAAG 



AAAATTTCCC 

AAAATrrocc 

AAAATTTOOC 
AAAAtnOOC 
AAAATTIGOC 
AAAATTTOOC 
AAAATTIGOC 



AAOnGATGG 
AAOCIGATGG 
AA0CIGA1GG 
AAOCIGATGG 
AAOCTGATGG 
AAOCTGATGG 
AAOCIGATCG 



AOCATAOCrc 
AOCATAGCTG 
AOCATAOCTG 
AGCATAOCTG 
AGCATAOCTG 
AGCATAOCTG 
AOCATAGCTG 



GAAAATOCTT 
GAAAATOCTT 
GAAAATOCTT 
GAAAATOCrr 
GAAAATOCTF 
GAAAATOCTT 
GAAAATOCTT 



TGCATTAOOG 
TQCATTACCT 
TGCATTAOCr 
TOCATTAOaC 
TXATTAOOO 
TGCATTAOOG 
TGCATTAOCG 



ACATGrrCAG 
OCATGTrCAO 
ACATCnCAG 
ACATGnCAG 
ACATCnCAC 
GCATtnTCAG 
ACATGTICAG 



CIGTOOGrTT 
CIXjICOGnT 

cixjTCxxmr 

CTGTOOGnT 
CTGrcOGTTT 

crcrrooGrrr 

CTGTOOGTTr 



■ wlOe 
mxoxoxl 

. mxoxoxl 
mxoxox3 
fnxoxox4 
mxoxoxi 
inxoxox6 

146 



GOOGGTOgrG 
GOCGGTOGTG 
GCOOGTCXnC 
OCOQCrOGTG 
GCOOGTOGrG 
GOCOGTCXJIXj 
GOCGGTOGTG 



GGCOGCATGG 
GOCGGCATOG 
GGOGGCATOG 
GGOGGCATGG 
GOOOOCATGG 
GGOGGCATOG 
GOOOOCATGG 



TCCAAGTTCA 
TOCAAGfTCA 
TGCAAGTIGA 
TGCAACnCA 
•tGCAAGTTCA 
TGCAAGTIGA 
TGCAAGTIGA 



ATAAOGGGAA 
ATAAOOGGAA 
ATAAOGGGAA 
ATAAOOGGAA 
ATAAOGGGAA 
ATAAOOGGAA 
ATAAOOGGAA 



ATGcrmocc 
ATGcnrocc 
ATG crrrocc 

ATGGTnxXC 
ATOCTTTCOC 

ATOGTrrcoc 

ATOGTITQCC 



wtCre 

mxoxoxl 

mxoxaxZ 

mxoxoxB 

mxoxox4 

mxoxoxS 

inxQxox6 

196 



gcagaaocig 
ccagaacctg 

GCAGAACCrc 
GCACAACCTG 
GCAGAACCrc 
GCAGAAOCrC 
GCAGAAOCTO 



AAGATGTTCG 

AACATGTTDC 
AAGATCTTCG 
AAGATGTTOG 
AaGaTCTTTOG 

aagatgttcg 

AAGATCTTCG 



OGATTATCTT 
CGATTATCTT 
CGATTATCrr 

ocattatot 

CGATTATCrr 
CGaTTATOT 
CGATTATCrr 



CTATATCrrc 
CTATATCTTC 
CTATATCrrc 
crATATCTTC 
CTATGTCrrc 
CfATATCrrC 
CTATATCTTC 



AGGOOOOOGC 
AGGCOCGOGG 
AOCOOOQOGG 
AOGCCOGOOG 

aggcgcgogg 
aggogcocgg 
agoogcgcgg 



wiOc 

rnxoxoxl 

vxoxox2 

nixoxoxl 

mxoxcxA 

mxoxox5 

inxoxox6 

246 



TCrcWCACTA 
TOCGCACTA 

tctcocacta 
tctggcacta 

TCTGOCACnrA 
TCTCOCACTA 
TCIOaCACTA 



AAAACTATCC 
AAAAOATCC 
AAAACTATCC 
AAAACTATCC 
AAAACTATCC 
AAAACTATCC 
AAAACTATCC 



AOCAACATrr GGOCCAGCTA AACATOCTIC 

AGCAACATTT GGCOCAOCrA AACATCCTTC 

AOCAACATTT CXXXTAGCTA AACATCCTTC 

AGCAACATTT OOCXXIAOCrA AACATCCTTC 

AGCAACATTT OtXKXAOCrA AACATCCTTC 

AGCAACATTT QGGCCAGCrA AACATOCTIC 

AGCAACA-nr GOOOCACcrA aacatoctic 



wiCrc 
mxoxoxi 
mxoxoxl 
roxoxox3 
mxoxox4 
mxoxoxS 
inxoxdx6 

296 

•.wlCre 
[Figure, continued: nucleotide sequence alignment of wild-type cre (wtCre) and the mutants mxoxox1 to mxoxox6, shown in blocks with position markers 396 through 1096. Legend: stop codon and point mutations are marked.]
[Figure: amino acid sequence alignment of wild-type Cre (wtCre) and the mutants mxoxox1 to mxoxox6, shown in blocks with position markers 51 through 301. Legend: mutated residues, catalytic residues, and DNA contacting residues are marked; α-helices and β-sheets are indicated.]
Transition   Frequency     Transversion   Frequency

A-G          29% (9)       A-C            6.5% (2)
G-A          26% (8)       C-A            <3% (0)
C-T          6.5% (2)      A-T            9.7% (3)
T-C          6.5% (2)      T-A            9.7% (3)
                           C-G            <3% (0)
                           G-C            <3% (0)
                           G-T            6.5% (2)
                           T-G            <3% (0)

Thirty-one point mutations were identified in the six analyzed mutants. The mutagenic
frequency can therefore be calculated as 0.5%. No frame-shift mutations were found, and
no codon of the cre sequence carried more than one point mutation. The table classifies
these mutations into the different types of transition and transversion events. The
frequencies are given with the actual numbers found in parentheses. Interestingly, the
A to G transition and its reverse, G to A, occurred much more often than all other
possible events. On the other hand, half of the possible transversions were not
identified at all.
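
The arithmetic behind these figures can be checked with a short sketch. The counts are copied from the table; the cre coding length of 1032 bp is an assumption introduced here for illustration and is not stated in this passage.

```python
# Counts of each substitution type, copied from the table above.
TRANSITIONS = {"A-G": 9, "G-A": 8, "C-T": 2, "T-C": 2}
TRANSVERSIONS = {"A-C": 2, "C-A": 0, "A-T": 3, "T-A": 3,
                 "C-G": 0, "G-C": 0, "G-T": 2, "T-G": 0}

# Assumption (not from the text): length of the cre coding sequence in bp.
ASSUMED_CRE_LENGTH_BP = 1032
MUTANTS_SEQUENCED = 6


def spectrum(counts, total):
    """Return each substitution type as a percentage of all mutations."""
    return {kind: 100.0 * n / total for kind, n in counts.items()}


total = sum(TRANSITIONS.values()) + sum(TRANSVERSIONS.values())

# Mutations per sequenced base, expressed as a percentage.
per_base_pct = 100.0 * total / (MUTANTS_SEQUENCED * ASSUMED_CRE_LENGTH_BP)

# Transversion types that were never observed.
unseen = [kind for kind, n in TRANSVERSIONS.items() if n == 0]

print(f"total point mutations: {total}")
print(f"mutagenic frequency: {per_base_pct:.2f}%")
for kind, pct in spectrum(TRANSITIONS, total).items():
    print(f"{kind}: {pct:.1f}%")
print(f"transversion types not observed: {len(unseen)} of {len(TRANSVERSIONS)}")
```

Under the assumed sequence length, the per-base frequency comes out close to the 0.5% quoted above, and four of the eight transversion types have a count of zero.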
Round   Induction   % KanR    No. KanR   % White

0       4 h         <0.01%               100%
1       4 h         0.16%     6          84%
2       4 h         0.56%     47         82%
3       2.5 h       0.2%      36         80%
4       2.5 h       4.6%      102        75%
Table 14

Mutant   loxK2     loxP

wt       <0.01%    94%
R3M1     66%       98%
R3M2     54%       98%
R3M3     65%       91%
R3M4     0.02%     93%
R3M5     15%       91%
R3M6     3%        94%
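
One way to read Table 14 is through the two questions the screen asks of each mutant: does it recombine the variant site, and does it still recombine the wild-type site? The sketch below takes the first column as recombination at the variant site and the second as recombination at the wild-type site. The 1% cut-off and the function name `classify` are hypothetical choices made for illustration, not values taken from the text.

```python
# Recombination percentages from Table 14: (variant site, wild-type site).
# "<0.01%" is stored as its upper bound, 0.01.
TABLE_14 = {
    "wt":   (0.01, 94.0),
    "R3M1": (66.0, 98.0),
    "R3M2": (54.0, 98.0),
    "R3M3": (65.0, 91.0),
    "R3M4": (0.02, 93.0),
    "R3M5": (15.0, 91.0),
    "R3M6": (3.0, 94.0),
}


def classify(variant_pct, wild_type_pct, threshold_pct=1.0):
    """Apply a hypothetical activity cut-off to both recombination readouts."""
    return {
        "recombines_variant_site": variant_pct >= threshold_pct,
        "retains_wild_type_activity": wild_type_pct >= threshold_pct,
    }


# Mutants that pass the cut-off on the variant site.
variants = [name for name, (v, w) in TABLE_14.items()
            if classify(v, w)["recombines_variant_site"]]

for name, (v, w) in TABLE_14.items():
    print(name, classify(v, w))
```

With this cut-off every entry retains activity on the wild-type site, while the wild-type enzyme and R3M4 fall below the cut-off on the variant site. A different threshold would move R3M6 (3%) most easily.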
CLAIMS

We claim: 

1. A method of identifying variant recombinases that mediate recombination
at variant recombination sites, the method comprising,

(a) bringing into contact

a mutant recombinase,

a first nucleic acid sequence comprising a first reporter gene and first
and second recombination sites, wherein the first and second recombination
sites are variant recombination sites, and

a second nucleic acid sequence comprising a second reporter gene
and third and fourth recombination sites, wherein the third and fourth
recombination sites can be recombined by a non-mutant recombinase,

(b) determining if recombination occurs between the first and second
recombination sites, and determining if recombination occurs between the third and
fourth recombination sites,

wherein recombination between the first and second recombination sites
indicates that the mutant recombinase is a variant recombinase that mediates
recombination at variant recombination sites,

wherein recombination between the third and fourth recombination sites
indicates that the mutant recombinase retains the ability to mediate recombination at
non-variant recombination sites.

2. The method of claim 1 wherein the recombination sites comprise
recognition sequences and compatibility sequences,

wherein the recognition sequences of the first and second recombination sites
differ from the recognition sequences of the third and fourth recombination sites,

wherein the compatibility sequences of the first and second recombination
sites are sufficiently similar to allow recombination between the first and second
recombination sites, and wherein the compatibility sequences of the third and fourth
recombination sites are sufficiently similar to allow recombination between the third
and fourth recombination sites, and

wherein the compatibility sequences of the first and second recombination
sites differ from the compatibility sequences of the third and fourth recombination
sites such that neither the first nor the second recombination site can be recombined
with either the third or the fourth recombination site.

3. The method of claim 1 wherein the first and second recombination sites
cannot be recombined by non-mutant recombinase to a significant extent.

4. The method of claim 1 or 2 wherein the first and second recombination 
sites have identical sequences, and wherein the third and fourth recombination sites 
have identical sequences. 

5. The method of claim 1 wherein recombination between the first and
second recombination sites alters the expression of the first reporter gene, wherein
recombination between the first and second recombination sites is determined by
determining if expression of the first reporter gene is altered, and

wherein recombination between the third and fourth recombination sites
alters the expression of the second reporter gene, wherein recombination between
the third and fourth recombination sites is determined by determining if expression
of the second reporter gene is altered.

6. The method of claim 5 wherein recombination between the first and 
second recombination sites allows the first reporter gene to be expressed. 

7. The method of claim 6 wherein the first nucleic acid sequence further
comprises a spacer sequence flanked by the first and second recombination sites,
wherein the spacer sequence interrupts the first reporter gene such that the first
reporter gene is not expressed, wherein recombination of the first and second 
recombination sites excises the spacer sequence which allows the first reporter gene 
to be expressed. 

8. The method of claim 6 wherein a portion of the first reporter gene is 
inverted, wherein the inverted portion of the first reporter gene is flanked by the first 
and second recombination sites, wherein recombination of the first and second 
recombination sites inverts the inverted portion of the first reporter gene which 
allows the first reporter gene to be expressed. 

9. The method of claim 5 wherein recombination between the first and 
second recombination sites prevents expression of the first reporter gene. 

10. The method of claim 9 wherein the first reporter gene is flanked by the
first and second recombination sites, wherein recombination of the first and second
recombination sites excises the first reporter gene which prevents expression of the
first reporter gene.

11. The method of claim 9 wherein a portion of the first reporter gene is
flanked by the first and second recombination sites, wherein recombination of the
first and second recombination sites inverts the flanked portion of the first reporter
gene which prevents expression of the first reporter gene.

12. The method of claim 5 wherein recombination between the third and
fourth recombination sites allows the second reporter gene to be expressed.

13. The method of claim 12 wherein the second nucleic acid sequence
further comprises a spacer sequence flanked by the third and fourth recombination
sites, wherein the spacer sequence interrupts the second reporter gene such that the
second reporter gene is not expressed, wherein recombination of the third and fourth
recombination sites excises the spacer sequence which allows the second reporter
gene to be expressed.

14. The method of claim 13 wherein the spacer sequence interrupts the 
second reporter gene such that the second reporter gene is not transcribed. 

15. The method of claim 13 wherein the second reporter gene encodes a 
protein, wherein the spacer sequence interrupts the second reporter gene such that
the protein encoded by the second reporter gene is not translated. 

16. The method of claim 13 wherein the spacer sequence interrupts the
second reporter gene such that the second reporter gene produces an inactive
expression product.

17. The method of claim 12 wherein a portion of the second reporter gene is
inverted, wherein the inverted portion of the second reporter gene is flanked by the
third and fourth recombination sites, wherein recombination of the third and fourth
recombination sites inverts the inverted portion of the second reporter gene which
allows the second reporter gene to be expressed.

18. The method of claim 5 wherein recombination between the third and
fourth recombination sites prevents expression of the second reporter gene.

19. The method of claim 18 wherein the second reporter gene is flanked by
the third and fourth recombination sites, wherein recombination of the third and
fourth recombination sites excises the second reporter gene which prevents
expression of the second reporter gene.

20. The method of claim 18 wherein a portion of the second reporter gene is
flanked by the third and fourth recombination sites, wherein recombination of the
third and fourth recombination sites inverts the flanked portion of the second
reporter gene which prevents expression of the second reporter gene.

21. The method of claim 1 wherein the first nucleic acid sequence is on a first
nucleic acid construct and the second nucleic acid sequence is on a second nucleic
acid construct.

22. The method of claim 21 wherein the first nucleic acid construct is an
extrachromosomal vector and the second nucleic acid construct is in the genome of a
host cell.

23. The method of claim 1 wherein the first and second nucleic acid
constructs are on the same nucleic acid construct.

24. A method for producing site-specific recombination of DNA,
comprising,

contacting a variant recombinase identified by the method of claim 1 with
first and second DNA sequences,

wherein the first DNA sequence comprises a first recombination site and the
second DNA sequence comprises a second recombination site,

wherein the variant recombinase mediates recombination between the first
and second recombination sites thereby producing the site-specific recombination.

25. The method of claim 24 wherein the first recombination site, the second 
recombination site, or both, are variant recombination sites. 

26. The method of claim 24, wherein the first and second DNA sequences
are connected by a pre-selected DNA segment.

27. The method of claim 26, wherein the first and second recombination 
sites have the same orientation and the site-specific recombination of DNA is a 
deletion of the pre-selected DNA segment. 

28. The method of claim 27, wherein the pre-selected DNA segment is a 
gene for a structural protein, an enzyme, or a regulatory molecule. 

29. The method of claim 27 further comprising contacting the variant
recombinase with a fourth DNA sequence comprising a third recombination site,
wherein the second and fourth DNA sequences are connected by a second
pre-selected DNA segment.

30. The method of claim 29 wherein the first recombination site is a variant
recombination site recognized by the variant recombinase and not by wild type
recombinase, and wherein the second and third recombination sites are
recombination sites recognized by wild type recombinase and by the variant
recombinase.

31. The method of claim 30 further comprising, prior to contacting the
variant recombinase with the first, second, and third recombination sites, contacting
the recombination sites with wild type recombinase, thereby producing site-specific
recombination between the second and third recombination sites resulting in a
deletion of the second pre-selected DNA segment.

32. The method of claim 29, wherein the second pre-selected DNA segment 
is a gene for a structural protein, an enzyme, or a regulatory molecule. 

33. The method of claim 26, wherein the first and second recombination
sites have opposite orientations and the site-specific recombination is an inversion of
the nucleotide sequence of the pre-selected DNA segment.

34. The method of claim 33, wherein the first and second recombination 
sites are variant recombination sites recognized by the variant recombinase. 

35. The method of claim 33, wherein the pre-selected DNA segment is a
gene for a structural protein, an enzyme, or a regulatory molecule. 

36. The method of claim 24, wherein the second and third DNA sequences
are introduced into two different DNA molecules and the site-specific recombination
is a reciprocal exchange of DNA segments proximate to the recombination sites.

37. The method of claim 36, wherein the first and second recombination
sites are variant recombination sites recognized by the variant recombinase.

38. The method of claim 24 wherein the second DNA sequence includes a
label, wherein recombination between the first and second recombination sites
associates the label with the first DNA sequence.

39. The method of claim 38 wherein the first DNA sequence is a large
circular DNA molecule.

40. The method of claim 24 wherein recombination occurs in a cell. 
41. The method of claim 40 wherein the variant recombinase is contacted
with the first and second DNA sequences by introducing into the cell a third DNA
sequence comprising DNA encoding the variant recombinase.

42. The method of claim 41, wherein the third DNA sequence further
comprises a regulatory nucleotide sequence and expression of the variant
recombinase is produced by activating the regulatory nucleotide sequence.

43. The method of claim 40, wherein the cell is a eukaryotic cell, a
mammalian cell, a yeast cell, a fungal cell, a prokaryotic cell, a bacterial cell, an
archaebacterial cell, or a cell in a multicellular organism.

44. The method of claim 43 wherein the multicellular organism is a plant, an
animal, or a mammal.

45. The method of claim 40, wherein the first and second DNA sequences
are connected by a pre-selected DNA segment, wherein the first and second
recombination sites have the same orientation and the site-specific recombination of
DNA is a deletion of the pre-selected DNA segment.

46. The method of claim 45 wherein the cell is in a multicellular organism. 

47. The method of claim 45, wherein the pre-selected segment is an 
undesired marker or trait gene. 

48. The method of claim 24 wherein the variant recombinase is contacted
with the recombination sites in vitro.

49. The method of claim 48 wherein the method further comprises
introducing the recombined DNA into a cell.

50. A method for cloning large DNA fragments, the method comprising
concatenating DNA fragments to be cloned with vector arms, wherein each
vector arm comprises a recombination site, wherein the DNA fragments and vector
arms alternate in the concatemer,

introducing the concatemer into a cell expressing a variant recombinase
identified by the method of claim 1, wherein the recombinase mediates
recombination of the recombination sites thereby generating circles each containing
a DNA fragment and a vector arm.

51. A method for cloning large DNA fragments, the method comprising
ligating a DNA fragment to be cloned to vector arms, wherein each vector
arm comprises (i) a blunt end, (ii) another end which is compatible with an end of
the DNA fragment to be cloned, and (iii) a recombination site, wherein concatemers
are not formed, and

introducing the ligated DNA fragment and vector arms into a cell expressing
a variant recombinase identified by the method of claim 1.

52. A method for cloning large DNA fragments, the method comprising
ligating a plurality of DNA fragments to be cloned with a plurality of first
and second vector arms, wherein each first vector arm comprises two ligatable ends,
wherein each second vector arm comprises a recombination site and one
non-ligatable end,

wherein, following ligation, the DNA fragments and first vector arms
alternate in concatemers, wherein the concatemers are flanked by second vector
arms,

introducing the concatemers into a cell expressing a variant recombinase
identified by the method of claim 1, wherein the recombinase mediates
recombination of the recombination sites thereby generating circles containing the
DNA fragments.

53. A variant recombinase identified by the method of claim 1.

54. A nucleic acid molecule encoding a variant recombinase identified by
the method of claim 1.

55. The nucleic acid molecule of claim 54 wherein the nucleic acid molecule 
is a plasmid. 

56. A cell containing the nucleic acid molecule of claim 54. 

57. The cell of claim 56 wherein the cell is a eukaryotic cell, a mammalian
cell, a yeast cell, a fungal cell, a prokaryotic cell, a bacterial cell, an archaebacterial
cell, or a cell in a multicellular organism.

58. The cell of claim 57 wherein the multicellular organism is a plant, an 
animal, or a mammal. 

59. A nucleic acid molecule having at least one variant recombination site,
wherein the variant recombination site is recognized by a variant recombinase
identified by the method of claim 1 and is not recognized by wild type recombinase.

60. The nucleic acid molecule of claim 59 wherein the recombination site is 
not recognized by wild type recombinase. 
61. The nucleic acid molecule of claim 59, wherein a first recombination site
and a second recombination site are connected by a pre-selected DNA segment. 

62. The nucleic acid molecule of claim 59 wherein the nucleic acid molecule
is a plasmid.

63. A cell containing the nucleic acid molecule of claim 59.
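
The dependence of the recombination outcome on site orientation, recited in the claims above (same orientation gives a deletion, opposite orientations give an inversion), can be illustrated with a minimal string-based sketch. This is an illustration only and not the claimed method. The loxP sequence used is the commonly cited 34 bp site and is not taken from this text; the function names `revcomp`, `find_sites`, and `recombine` are hypothetical.

```python
# Commonly cited loxP site: 13 bp repeat + 8 bp asymmetric core + 13 bp repeat.
# (Assumption: sequence supplied for illustration, not quoted from the text.)
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(dna, site):
    """Return sorted (start, strand) pairs for every occurrence of the site."""
    hits = []
    for strand, pattern in (("+", site), ("-", revcomp(site))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = dna.find(pattern, start + 1)
    return sorted(hits)


def recombine(dna, site=LOXP):
    """Simulate recombination between exactly two sites on one linear molecule.

    Returns (kind, product, excised), where excised is the circular product
    written as a linear string, or None for an inversion.
    """
    hits = find_sites(dna, site)
    if len(hits) != 2:
        raise ValueError("expected exactly two recombination sites")
    (a, strand_a), (b, strand_b) = hits
    n = len(site)
    if strand_a == strand_b:
        # Same orientation: the flanked segment leaves with one site,
        # and one site stays behind on the parent molecule.
        excised = dna[a + n:b + n]
        remaining = dna[:a + n] + dna[b + n:]
        return "deletion", remaining, excised
    # Opposite orientation: the segment between the sites is inverted in place.
    inverted = dna[:a + n] + revcomp(dna[a + n:b]) + dna[b:]
    return "inversion", inverted, None


left, segment, right = "GGGG", "CCCATTT", "AAAA"
print(recombine(left + LOXP + segment + LOXP + right)[0])
print(recombine(left + LOXP + segment + revcomp(LOXP) + right)[0])
```

The orientation test works because the 8 bp core is not palindromic, so a site and its reverse complement are distinct strings. A variant site would be handled the same way by passing a different `site` argument.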